Approximate Privacy: PARs for Set Problems 



Joan Feigenbaum* Aaron D. Jaggard^ 

Department of Computer Science DIMACS 

Yale University Rutgers University 

joan.feigenbaum@yale.edu adj@dimacs.rutgers.edu

Michael Schapira* 
Department of Computer Science 
Yale University 
and 

Computer Science Division 
University of California, Berkeley 
michael.schapira@yale.edu



Abstract 

In previous work (arXiv:0910.5714), we introduced the Privacy Ap- 
proximation Ratio (PAR) and used it to study the privacy of protocols 
for second-price Vickrey auctions and Yao's millionaires problem. Here, 
we study the PARs of multiple protocols for both the disjointness prob- 
lem (in which two participants, each with a private subset of {1, . . . , k}, 
determine whether their sets are disjoint) and the intersection problem 
(in which the two participants, each with a private subset of {1, . . . , k}, 
determine the intersection of their private sets). 

We show that the privacy, as measured by the PAR, provided by any 
protocol for each of these problems is necessarily exponential (in k). We 
also consider the ratio between the subjective PARs with respect to each 
player in order to show that one protocol for each of these problems is 
significantly fairer than the others (in the sense that it has a similarly bad 
effect on the privacy of both players).

1 Introduction



Widespread use of computers and networks in almost all aspects of daily life has 
led to a proliferation of sensitive electronic data records and thence to extensive 
study of privacy-preserving computation. One fruitful approach is based on the 
combinatorial characterization of privately computable functions put forth by 
Chor and Kushilevitz [4] and the subsequent communication-complexity analysis
of privately computable functions by Kushilevitz. Using this approach, one
can show, for example, that Yao's millionaires' problem [15] is not perfectly
privately computable [4] and that the two-bidder, 2nd-price Vickrey auction is
perfectly privately computable but only at the cost of an exponential amount
of communication by the bidders [3].

Motivated by the fact that functions of interest may not be perfectly privately
computable or may be so only by impractically costly protocols, we began
in [8] a communication-complexity-based investigation of approximate privacy.
We formulated both worst-case and average-case versions of the
privacy-approximation ratio (PAR) of a function $f$ in order to quantify the amount of
privacy that can be preserved by a protocol that computes $f$ and studied the
tradeoff between approximate privacy and communication complexity in protocols
for the millionaires' problem and the two-bidder, 2nd-price Vickrey auction.

Informally, a two-party protocol is perfectly privacy-preserving if the two
parties (or a third party observing the communication between them) cannot
learn more from the execution of the protocol than the value of the function
the protocol computes. (This notion can be extended naturally to protocols
involving more than two participants, but we do not consider the more general
notion in this paper.) Chor and Kushilevitz [4, 11] formalize this notion
of privacy using the communication-complexity-theoretic notions of the ideal
monochromatic regions of a function $f$ and the monochromatic rectangles of a
protocol $P$ that computes $f$. Every two-input function $f$ can be represented by
a two-dimensional matrix $A(f)$ in which $A(f)_{(x_1,x_2)} = f(x_1,x_2)$. In the
partition of $A(f)$ into the ideal monochromatic regions of $f$, the entries
$A(f)_{(x_1,x_2)}$ and $A(f)_{(y_1,y_2)}$ are in the same region if and only if
$f(x_1,x_2) = f(y_1,y_2)$; if $f$ is perfectly privately computable, then there is a
protocol $P$ for $f$ that partitions $A(f)$ into a set of monochromatic rectangles
that is exactly equal to the set of ideal monochromatic regions of $f$. For
functions that are not perfectly privately computable, our notions of
approximate privacy [8] quantify the worst-case and average-case ratios between
the size of an ideal monochromatic region of $f$ and the corresponding
monochromatic rectangle in the partition induced by a maximally
privacy-preserving protocol for $f$.

In this paper, we apply our PAR framework to the intersection problem (in
which party 1's input is a set $S_1$, party 2's input is a set $S_2$, and the goal of the
protocol is to compute $S_1 \cap S_2$) and to its decision version, disjointness (in which
$f(S_1,S_2) = 1$ if $S_1 \cap S_2 = \emptyset$, and $f(S_1,S_2) = 0$ otherwise). From both the
privacy perspective and the communication-complexity perspective, these are
extremely natural problems to study. The intersection problem has served as a
motivating example in the study of privacy-preserving computation for decades;
in a typical application, two organizations wish to compute the set of members
that they have in common without disclosing to each other the people who are 
members of only one of the organizations. The disjointness problem plays a 
central role in the theory and application of communication complexity, where 
the fact that n + 1 bits of communication are required to test disjointness of 
two subsets of {1, . . . , n} is used to prove many worst-case lower bounds. 

1.1 Our Findings 

In applying our PAR framework to the disjointness and intersection problems, 
we consider three natural protocols that apply to both problems. We compute 
the objective and subjective PARs for all three protocols for both problems. 
The objective and subjective PARs are exponential in all cases, but we show 
that the protocol that is intuitively the best is quantifiably (and significantly) 
more fair than the others in the sense described below; to do this, we consider 
the ratios of the subjective PARs (as described in Sec. 2.3) and argue that this
captures some intuitive sense of fairness. Table 1 in Sec. 3 summarizes our
results for PAR values for the various problems and protocols that we consider
here; the corresponding theorems and proofs are in Secs. 4 and 5.

1.2 Related Work: Defining Privacy-Preserving Computation

In addition to Brandt and Sandholm [3], who used Kushilevitz's formulation
of privacy-preserving computation to prove an exponential lower bound on the
communication complexity of privacy-preserving 2nd-price Vickrey auctions, the
privacy work of Bar-Yehuda et al. [1] is also based on the
communication-complexity framework of [4, 11].

Among other approaches to privacy-preserving computation, the most exten- 
sively developed is that of secure, multiparty computation (SMC). As observed 
by Brandt and Sandholm [3], bidders' privacy in online auctions, which was our
original motivation as well as theirs, could in principle be achieved by starting
with a strategyproof mechanism and then having the agents themselves compute
the outcome and payments using an SMC protocol. This approach has been
followed successfully by, for example, Dodis, Halevi, and Rabin [5] and Naor,
Pinkas, and Sumner [14] but, as discussed in more detail in [3, 8], can in general
require assumptions about the strategic nature of the computational nodes that 
do not apply to bidders in auctions, unproven cryptographic assumptions, or 
excessive communication costs. Thus, non-SMC approaches are worth pursuing. 

In our study of PAR, we consider protocols that compute exact results but 
preserve privacy only approximately. Several works, including [2, 7, 10], have
considered protocols that compute approximate results in a privacy-preserving
manner, but they are unrelated to the questions we ask here. Similarly,
definitions and techniques from differential privacy [6] (and its mechanism-design
extensions [9, 13]) are aimed at computing approximate results and are
inapplicable to the problems that we study here.

1.3 Paper Outline

In Sec. 2, we review the PAR framework of [8] and discuss the ratios of
average-case subjective PARs. Section 3 gives formal definitions of the problems we
study, describes the protocols for these problems that we consider, and gives
a summary and discussion of our PAR results. Sections 4 and 5 give the full
statements and proofs of our PAR results. Section 6 discusses avenues for future
work. Appendix A provides additional background about our approach.
Sections 2.1 and 2.2 and Apps. A and B are drawn from [8]; we include
them here for the convenience of the reader.

2 Privacy Approximation Ratios 

We now review our formulations of Privacy Approximation Ratios (PARs) [8].
We refer readers to Section A.2 of the Appendix below for a more thorough
explanation. We assume that the reader is familiar with Yao's model of two-party
communication. Readers unfamiliar with this material should refer to
Section A.1 of the Appendix below or, for a more in-depth treatment, to Kushilevitz
and Nisan [12].

Chor and Kushilevitz [4, 11] put forth definitions and characterizations of
perfectly private communication protocols. Their framework was further developed
in [8], where we introduced the notion of PARs. In this paper, as in [8],
we deal only with deterministic communication protocols, but the framework
can be extended to randomized protocols.

As explained in the previous section, there are natural problems for which 
perfect privacy is either impossible or very costly (in terms of communication 
complexity) to obtain. Privacy-approximation ratios (PARs) allow us to quan- 
tify how well a protocol preserves privacy relative to the ideal (but perhaps 
impossible to implement) computation of the outcome of a problem. Approxi- 
mate privacy has both worst-case and average-case formulations. 

2.1 Worst-Case PARs 

Any function $f : \{0,1\}^k \times \{0,1\}^k \to \{0,1\}^t$ can be visualized as a
$2^k \times 2^k$ matrix with entries in $\{0,1\}^t$, in which the rows represent the
possible inputs of party 1, the columns represent the possible inputs of party 2,
and each entry contains the value of $f$ associated with its row and column
inputs. This matrix is denoted $A(f)$.

For any communication protocol $P$ for a function $f$, let $R^P(x_1,x_2)$ denote the
monochromatic rectangle in $A(f)$ induced by $P$ for the pair of inputs $(x_1,x_2)$.
Let $R^I(x_1,x_2)$ denote the maximal monochromatic region in $A(f)$ containing
$A(f)_{(x_1,x_2)}$, i.e., the maximal set of entries in $A(f)$ that contain the value
$f(x_1,x_2)$. Intuitively, $R^P(x_1,x_2)$ is the set of inputs that are
indistinguishable from $(x_1,x_2)$ to this particular protocol $P$. Similarly, $R^I(x_1,x_2)$
is the set of inputs that would be indistinguishable from $(x_1,x_2)$ to a perfectly
private protocol if such a protocol existed. We wish to quantify how far $P$ is from a
hypothetical ideal protocol in terms of indistinguishability of inputs. Let $|R|$
denote the size or cardinality of $R$, i.e., the number of inputs in $R$.

Definition 2.1 (Worst-case objective PAR of P). The worst-case objective
privacy-approximation ratio of communication protocol $P$ for function $f$ is

\[ \alpha = \max_{(x_1,x_2)} \frac{|R^I(x_1,x_2)|}{|R^P(x_1,x_2)|}. \]

We say that $P$ is $\alpha$-objective-privacy-preserving in the worst case.
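
As an illustration of this definition, the following minimal Python sketch computes the worst-case objective PAR by brute force for small k. It is not taken from the paper: the names `disjointness`, `trivial_tile`, and `worst_case_objective_par` are hypothetical, inputs are encoded as integer bitmasks, and a protocol is represented (as an assumption of this sketch) by a function that maps each input pair to a label identifying its protocol-induced rectangle.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def disjointness(x1, x2):
    """Return 1 if the sets encoded by the bitmasks x1 and x2 are disjoint, else 0."""
    return int(x1 & x2 == 0)


def trivial_tile(x1, x2):
    """Label of the rectangle containing (x1, x2) when player 1 sends his whole
    input and player 2 replies with the output (hypothetical representation)."""
    return (x1, disjointness(x1, x2))


def worst_case_objective_par(f, tile_of, k):
    """Maximum over all inputs of |R^I(x1, x2)| / |R^P(x1, x2)|."""
    inputs = list(product(range(2 ** k), repeat=2))
    ideal_size = Counter(f(x1, x2) for x1, x2 in inputs)       # region size per output
    tile_size = Counter(tile_of(x1, x2) for x1, x2 in inputs)  # rectangle size per label
    return max(
        Fraction(ideal_size[f(x1, x2)], tile_size[tile_of(x1, x2)])
        for x1, x2 in inputs
    )
```

For k = 2, the maximum is attained when player 1 holds the full set and player 2 holds the empty set: the ideal region of disjoint pairs has 9 elements, while the induced rectangle contains a single input pair.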

Given any region $R$ in the matrix $A(f)$, if party 1's private input is $x$, then
party 1 can use this knowledge to eliminate all entries in $R$ outside of row $x$;
similarly, party 2 can eliminate all parts of $R$ outside of the appropriate column.
Hence, the other parties should be concerned not with all of $R$ but rather with
what we call the i-partitions of $R$.

Definition 2.2 (i-partitions). The 1-partition of a region $R$ in a matrix $A$ is the
set of disjoint rectangles $R_{x_1} = \{x_1\} \times \{x_2 \text{ s.t. } (x_1,x_2) \in R\}$ (over all
possible inputs $x_1$). 2-partitions are defined analogously.

Definition 2.3 (i-induced tilings). The i-induced tiling of a protocol P is the 
refinement of the tiling induced by P obtained by i-partitioning each rectangle 
in it. 

Definition 2.4 (i-ideal monochromatic partitions). The i-ideal monochromatic 
partition is the refinement of the ideal monochromatic partition obtained by i- 
partitioning each region in it. 

If $P$ is a communication protocol for the function $f$, then we let $R_i^P(x_1,x_2)$
denote the monochromatic rectangle containing $A(f)_{(x_1,x_2)}$ in the i-induced
tiling for $P$. Similarly, we let $R_i^I(x_1,x_2)$ denote the monochromatic rectangle
containing $A(f)_{(x_1,x_2)}$ in the i-ideal monochromatic partition of $A(f)$.

Definition 2.5 (Worst-case PAR of P with respect to i). The worst-case
privacy-approximation ratio with respect to i of communication protocol $P$ for
function $f$ is

\[ \alpha = \max_{(x_1,x_2)} \frac{|R_i^I(x_1,x_2)|}{|R_i^P(x_1,x_2)|}. \]

We say that $P$ is $\alpha$-privacy-preserving with respect to i in the worst case.

Definition 2.6 (Worst-case subjective PAR of P). The worst-case subjective
privacy-approximation ratio of communication protocol $P$ for function $f$ is the
maximum, over i = 1, 2, of the worst-case privacy-approximation ratio with
respect to party i.

Definition 2.7 (Worst-case PAR). The worst-case objective (subjective) PAR
for a function $f$ is the minimum, over all protocols $P$ for $f$, of the worst-case
objective (subjective) PAR of $P$.

2.2 Average-Case PARs

As we showed in [8], good approximate privacy may be just as unobtainable 
as perfect privacy if one insists on worst-case bounds. Thus, we also consider 
average-case PAR, i.e., the average ratio between the size of the monochromatic
rectangle containing the private inputs and the corresponding region in the ideal 
monochromatic partition. 

Definition 2.8 (Average-case objective PAR of P). Let $D$ be a probability
distribution over the space of inputs. The average-case objective
privacy-approximation ratio of communication protocol $P$ for function $f$ is

\[ \alpha = E_D\left[ \frac{|R^I(x_1,x_2)|}{|R^P(x_1,x_2)|} \right]. \]

We say that $P$ is $\alpha$-objective-privacy-preserving in the average case with
distribution $D$ (or with respect to $D$).

We define average-case PAR with respect to i analogously and average-case
subjective PAR as the maximum over i of the average-case PAR with respect
to player i. Finally, we define the average-case objective (subjective) PAR for
a function $f$ as the minimum, over all protocols $P$ for $f$, of the average-case
objective (subjective) PAR of $P$.

In computing the average-case PAR (either objective or subjective) with
respect to the uniform distribution, we may simplify the previous expressions
for PAR values. If each player's value space has k bits, then the average-case
objective PAR with respect to the uniform distribution equals

\[ \mathrm{PAR}_k = \frac{1}{2^{2k}} \sum_{(x_1,x_2)} \frac{|R^I(x_1,x_2)|}{|R^P(x_1,x_2)|}, \]

where the sum is over all pairs $(x_1,x_2)$ in the value space. We may combine all
of the terms corresponding to points in the same protocol-induced rectangle to
obtain

\[ \mathrm{PAR}_k = \frac{1}{2^{2k}} \sum_S |S| \cdot \frac{|R^I(S)|}{|S|} = \frac{1}{2^{2k}} \sum_S |R^I(S)|, \tag{1} \]

where the sums are now over protocol-induced rectangles $S$. Note also that the
average-case PAR with respect to i and with respect to the uniform distribution
is obtained by replacing $R^I(S)$ with $R_i^I(S)$ in Eq. (1).
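
The simplification above can be checked numerically for small k. The sketch below is illustrative only: it assumes integer bitmask inputs, the disjointness function from the introduction, and a hypothetical `trivial_tile` labeling in which player 1 reveals his whole input. It evaluates the average-case objective PAR under the uniform distribution both pointwise (one term per input pair) and per tile (one term per protocol-induced rectangle) using exact rational arithmetic.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def disjointness(x1, x2):
    """Return 1 if the sets encoded by the bitmasks x1 and x2 are disjoint, else 0."""
    return int(x1 & x2 == 0)


def trivial_tile(x1, x2):
    """Hypothetical rectangle label: player 1's input together with the output."""
    return (x1, disjointness(x1, x2))


def average_objective_par(f, tile_of, k):
    """Return the uniform average-case objective PAR computed in two ways."""
    inputs = list(product(range(2 ** k), repeat=2))
    ideal_size = Counter(f(x1, x2) for x1, x2 in inputs)       # |R^I| per output value
    tile_size = Counter(tile_of(x1, x2) for x1, x2 in inputs)  # |S| per rectangle label
    tile_value = {tile_of(x1, x2): f(x1, x2) for x1, x2 in inputs}

    # Pointwise form: average of |R^I(x1, x2)| / |R^P(x1, x2)| over all inputs.
    pointwise = sum(
        Fraction(ideal_size[f(x1, x2)], tile_size[tile_of(x1, x2)])
        for x1, x2 in inputs
    ) / 4 ** k

    # Per-tile form: every rectangle S contributes |R^I(S)| exactly once.
    per_tile = Fraction(sum(ideal_size[tile_value[t]] for t in tile_size), 4 ** k)
    return pointwise, per_tile
```

The two values coincide because the |S| points of a rectangle S each contribute |R^I(S)| / |S| to the pointwise sum.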

It may seem that a probability-mass-based definition of average-case PAR 
should be used instead, i.e., that the occurrences of set cardinality in the
quantity considered in Def. 2.8 should be replaced by the probability measure of the
regions in question. However, as we discuss in [8], such a definition is unable 
to distinguish between examples that should be viewed as having very different 
levels of privacy; by contrast, the definition that we consider here is able to 
distinguish between such cases. 

2.3 Ratios of Subjective PARs

Here we introduce a new quantity that we did not consider in [8]. Given some
protocol $P$ for a function $f$, let $\mathrm{PAR}_i(k)$ be the average-case subjective PAR of
$P$ with respect to protocol participant i and distribution $D$ on the k-bit input
space. We then let

\[ \mathrm{PAR}^{\max}(k) = \max_i \mathrm{PAR}_i(k) \quad \text{and} \quad \mathrm{PAR}^{\min}(k) = \min_i \mathrm{PAR}_i(k), \]

where the max and min are taken over all protocol participants. We then define
the ratio of (average-case) subjective PARs to be

\[ \frac{\mathrm{PAR}^{\max}(k)}{\mathrm{PAR}^{\min}(k)} \geq 1. \]

Intuitively, in a two-participant protocol, this captures how much greater a 
negative effect the protocol P can have on one participant than on the other 
participant. The average-case subjective PAR of a protocol P identifies the 
maximum effect that P can have on the privacy with respect to a participant. 
However, it does not capture whether this effect is similar for both players, and 
in fact this effect can be quite different. Below we show that, for both the dis- 
jointness and intersection problems, there are protocols that have exponentially 
large subjective PARs; for some protocols, the subjective PAR with respect to 
one player is exponentially larger than that with respect to the other player, 
while for one protocol for each problem, the subjective PARs with respect to the 
different players differ only by a constant (asymptotic) factor. We argue that 
this is an important distinction and that the ratio of average-case subjective 
PARs captures some intuitive notion of the fairness of the protocol. If a proto- 
col has a much larger PAR with respect to player 2 than with respect to player 
1, an agent might agree to participate in a protocol run only if he is assigned 
the role of player 2 (so that he learns much more about the other player than 
the other player learns about him). Thus, from the perspective of the protocol 
implementer who needs to induce participation, protocols with small ratios of 
average-case subjective PARs would likely be more desirable. 
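
To make the ratio concrete, here is a small brute-force sketch, under stated assumptions, that computes the average-case PAR with respect to each player for the uniform distribution and then takes the ratio of the larger to the smaller. The function names and the bitmask encoding are hypothetical, and the example protocol is the one in which player 1 reveals his whole input and player 2 replies with the disjointness bit.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def disjointness(x1, x2):
    """Return 1 if the sets encoded by the bitmasks x1 and x2 are disjoint, else 0."""
    return int(x1 & x2 == 0)


def trivial_tile(x1, x2):
    """Hypothetical rectangle label: player 1's input together with the output."""
    return (x1, disjointness(x1, x2))


def average_par_wrt(i, f, tile_of, k):
    """Uniform average-case PAR with respect to player i (i is 1 or 2)."""
    inputs = list(product(range(2 ** k), repeat=2))
    own = i - 1  # coordinate of the input that player i already knows
    # i-ideal partition: split each ideal region by player i's own input.
    ideal = Counter((x[own], f(*x)) for x in inputs)
    # i-induced tiling: split each protocol rectangle by player i's own input.
    induced = Counter((x[own], tile_of(*x)) for x in inputs)
    total = sum(
        Fraction(ideal[(x[own], f(*x))], induced[(x[own], tile_of(*x))])
        for x in inputs
    )
    return total / 4 ** k


def ratio_of_subjective_pars(f, tile_of, k):
    """Largest per-player average-case PAR divided by the smallest."""
    pars = [average_par_wrt(i, f, tile_of, k) for i in (1, 2)]
    return max(pars) / min(pars)
```

For k = 2 this gives a PAR of 1 with respect to player 1 and 21/8 with respect to player 2, so the ratio is 21/8.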

3 Overview of Problems, Results, and Protocols 

We now provide an overview of our PAR results and discuss their significance. 
We start with technical definitions of the problems and protocols that we con- 
sider here. 

3.1 Problems 

We define the Disjointness_k problem as follows:
Problem: Disjointness_k
Input: Sets $S_1, S_2 \subseteq \{1, \ldots, k\}$ encoded by $x_1$ and $x_2$.
Output: 1 if $S_1 \cap S_2 = \emptyset$, 0 if $S_1 \cap S_2 \neq \emptyset$.

Figure 1 illustrates the ideal monochromatic partition of the 3-bit value
space; inputs for which $S_1$ and $S_2$ are disjoint are white, and inputs for which
these sets are not disjoint are black.

Figure 1: Ideal monochromatic partition for Disjointness_k with k = 3.

We define the Intersection_k problem as follows:
Problem: Intersection_k
Input: Sets $S_1, S_2 \subseteq \{1, \ldots, k\}$.
Output: The set $S_1 \cap S_2$.

Figure 2 shows the ideal monochromatic partition of the 3-bit value space
for Intersection_k. The key at the right indicates the output set. (Here,
as throughout this paper, we encode $S \subseteq \{1, \ldots, k\}$ as a bitstring of length k
in which the most significant bit is 1 if $k \in S$, etc., so that 1011 encodes
$\{1,2,4\} \subseteq \{1,2,3,4\}$; we will abuse notation and identify $x \in \{0,1\}^k$ with the
subset of $\{1, \ldots, k\}$ that it encodes.)
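
The encoding convention can be illustrated with a short Python sketch; the helper names `encode` and `decode` are hypothetical and chosen only for this example.

```python
def encode(subset, k):
    """Encode a subset of {1, ..., k} as a k-bit string whose most significant
    (first) bit is 1 exactly when k is in the subset."""
    return "".join("1" if element in subset else "0" for element in range(k, 0, -1))


def decode(bits):
    """Recover the subset of {1, ..., k} encoded by a bitstring of length k."""
    k = len(bits)
    # The bit at 0-based position j corresponds to the element k - j.
    return {k - j for j, bit in enumerate(bits) if bit == "1"}
```

With this convention, the j-th bit (counting from 1 at the most significant end) corresponds to the element k + 1 - j.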



Figure 2: Ideal monochromatic partition for the Intersection_k problem with k = 3.

3.2 Protocols

For each problem, we identify three possible protocols for computing the output
of the problem. We describe these protocols here; in Secs. 4 and 5 we discuss the
structure of the tilings that these protocols induce and illustrate these tilings
for k = 1, 2, 3.

Trivial protocol. In the trivial protocol, player 1 (w.l.o.g.) sends his input to
player 2, who computes the output and sends this back to player 1.
This requires the transmission of k + 1 bits for Disjointness_k and 2k bits for
Intersection_k.

1-first protocol. In the 1-first protocol, player 1 announces a bit, and player
2 replies with his corresponding bit if its value might affect the output (i.e., if
player 1's value for this bit is 1); this continues until the output is determined.
In detail, player 1 announces the most significant (first) bit of $x_1$. After player
1 announces his j-th bit, if this bit is 0 and j < k, then player 1 announces his
(j+1)-st bit. If this bit is 0 and j = k, then the protocol terminates (with output
1 if computing Disjointness_k). If this bit is 1, then player 2 announces
the value of his j-th bit. If player 2's j-th bit is also 1, then for Disjointness_k
the protocol terminates with output 0, and for Intersection_k the protocol
continues (with $k + 1 - j$ in the output set); if player 2's bit is 0 and j < k, then
player 1 announces his (j+1)-st bit, while if j = k, then the protocol terminates.
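
The 1-first protocol described above can be simulated directly. The following is a minimal sketch rather than a definitive implementation: the function name `one_first`, the `problem` argument, and the representation of the transcript as the concatenation of announced bits are assumptions made for illustration. Inputs are bitstrings of equal length with the most significant bit first.

```python
def one_first(x1, x2, problem="disjointness"):
    """Simulate the 1-first protocol; return (transcript, output)."""
    k = len(x1)
    transcript = []
    intersection = set()
    for j in range(1, k + 1):
        b1 = x1[j - 1]
        transcript.append(b1)            # player 1 announces his j-th bit
        if b1 == "0":
            continue                     # player 2 stays silent on this bit
        b2 = x2[j - 1]
        transcript.append(b2)            # player 2 replies with his j-th bit
        if b2 == "1":
            if problem == "disjointness":
                return "".join(transcript), 0   # common element found
            intersection.add(k + 1 - j)  # the j-th bit encodes element k + 1 - j
    if problem == "disjointness":
        return "".join(transcript), 1
    return "".join(transcript), intersection
```

For example, on inputs 101 and 010 the sketch produces the transcript 10010 with output 1; player 2's second bit is never announced because player 1's second bit is 0.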

Alternating protocol. In the alternating protocol, the role of being the first
player to announce the value of a particular bit alternates between the players
whenever the first player to announce the value of his j-th bit announces "0" (in
which case the other player does not announce the value of his corresponding
bit). This continues until the output is determined. In detail, player 1 starts by
announcing the most significant (first) bit of $x_1$. After player i announces the
value of his j-th bit, if this bit is 0 and j < k, then the other player announces
his (j+1)-st bit; if i's j-th bit is 0 and j = k, the protocol terminates (with output
1 if computing Disjointness_k).

If i's j-th bit is 1 and the other player had previously announced his j-th bit
(which would necessarily be 1, else player i would not be announcing his j-th
bit), then the protocol terminates with output 0 if computing Disjointness_k,
or it continues with the other player announcing his (j+1)-st bit (and with
$k + 1 - j$ being part of the output set). If i's j-th bit is 1 and the other player
had not previously announced his j-th bit, then the other player announces his
j-th bit; if that bit is 0, then player i proceeds as above. If that bit is 1 and
Disjointness_k is being computed, the protocol terminates with output 0; if
the bit is 1 and Intersection_k is being computed, then player i proceeds as
above (and $k + 1 - j$ will be in the output set).

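
The alternating protocol can be sketched in the same style. This is one reading of the description above, stated as an assumption: the player who speaks first on a bit keeps that role unless he announces 0, in which case the other player speaks first on the next bit. The names `alternating` and `problem` are hypothetical, and inputs are bitstrings with the most significant bit first.

```python
def alternating(x1, x2, problem="disjointness"):
    """Simulate the alternating protocol; return (transcript, output)."""
    k = len(x1)
    bits = {1: x1, 2: x2}
    first = 1                            # player who speaks first on the current bit
    transcript = []
    intersection = set()
    for j in range(1, k + 1):
        second = 3 - first
        b_first = bits[first][j - 1]
        transcript.append(b_first)
        if b_first == "0":
            first = second               # the other player leads on the next bit
            continue
        b_second = bits[second][j - 1]
        transcript.append(b_second)      # reply is needed only after a 1
        if b_second == "1":
            if problem == "disjointness":
                return "".join(transcript), 0
            intersection.add(k + 1 - j)  # the j-th bit encodes element k + 1 - j
        # After a 1, the same player keeps the leading role.
    if problem == "disjointness":
        return "".join(transcript), 1
    return "".join(transcript), intersection
```

On inputs 010 and 101 the transcript is 000: player 1 reveals his first and third bits, player 2 reveals only his second bit, and the output is 1.
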
3.3 Results 

Table 1 summarizes our PAR results for the Disjointness_k and Intersection_k
problems. The rows labeled with "All" describe bounds for all protocols
for that problem (as reflected by the inequalities). Asymptotic results are for
$k \to \infty$; entries of "-" for bounds on subjective PARs indicate that we do not
have results beyond those implied by the PARs for specific protocols. For
Intersection_k, the results for the trivial and 1-first protocols are shown together;
as shown in Lemma 5.1, these protocols induce the same tiling, so the PAR
results are the same. All of these results are for average-case PARs
with respect to the uniform distribution; they include objective and subjective
PARs and the ratio of the subjective PARs.



Table 1: Summary of results. Asymptotic results are for $k \to \infty$.
(Surviving column headers: Problem; Subjective PAR; Ratio of ...)



3.3.1 Discussion of results for Disjointness_k

All three protocols have the lowest possible average-case objective PAR for
Disjointness_k. They also have average-case subjective PARs that are exponential
in k, although the bases differ. When considering these protocols (and the tilings
they induce as depicted in Sec. 4), however, our intuition is that players are much
less likely to participate in the trivial and 1-first protocols (if they do so as player
1) than they are to participate in the alternating protocol. This is captured by
the comparison of the average-case subjective PAR with respect to the two
players in each protocol: In the trivial and 1-first protocols, the subjective PAR
with respect to player 2 is exponentially worse than the subjective PAR with
respect to player 1; by contrast, in the alternating protocol the subjective PARs
differ (asymptotically) by a constant factor. We do not have any absolute lower
bound for the average-case subjective PAR for Disjointness_k. However, we
conjecture that this grows exponentially.

Conjecture 3.1. The average-case subjective PAR for Disjointness_k with
respect to the uniform distribution grows exponentially in k.

3.4 Discussion of results for Intersection_k

From a high-level perspective, the PAR results for Intersection_k are very
similar to those for Disjointness_k. As for their Disjointness_k variants, all
three protocols have exponentially large average-case objective PAR for
Intersection_k; we show that the average-case objective PAR for Intersection_k is
also exponential in k, and we conjecture that this bound can be tightened to
match the $2^k$ asymptotic growth of the average-case objective PAR for all three
of these protocols.

Conjecture 3.2. The average-case objective PAR for Intersection_k is
asymptotic to $2^k$.

All three protocols also have average-case subjective PARs that are exponential
in k, although the bases differ. Our intuition that the alternating protocol is
significantly better is not captured by the average-case objective and subjective
PARs, but we again see it when we consider the ratio of the subjective PARs: In
the trivial and 1-first protocols, the subjective PAR for player 1 is exponentially
worse than the subjective PAR for player 2; by contrast, in the alternating
protocol the subjective PARs differ only by a constant factor. We do not have any
absolute lower bound for the average-case subjective PAR for Intersection_k.
However, as for Disjointness_k, we conjecture that this grows exponentially.

Conjecture 3.3. The average-case subjective PAR for Intersection_k with
respect to the uniform distribution grows exponentially in k.

4 PARs for Disjointness_k

4.1 Structure of Protocol-Induced Tilings 

The tiling induced by the trivial protocol is straightforward. For every input
$S_1 \neq \emptyset$ held by player 1, there are two monochromatic rectangles in the
corresponding row of the input space: $\{(S_1,S_2) \mid S_2 \cap S_1 \neq \emptyset\}$ and
$\{(S_1,S_2) \mid S_2 \cap S_1 = \emptyset\}$. The row corresponding to $S_1 = \emptyset$ forms a single
monochromatic rectangle.

Figure 3 depicts the 1-first-protocol-induced tiling of the 1-, 2-, and 3-bit
input spaces. Each tile is labeled with the transcript produced by the protocol
on inputs from that tile; note that some tiles are depicted as non-contiguous
regions. When the input space is depicted as in Fig. 3 (i.e., with the possible
values of $S_1$ and $S_2$ arranged in increasing lexicographic order from the top-left
corner), the tiling of the (k+1)-bit input space induced by the 1-first protocol
can be obtained as follows. Let $T_k$ be the 1-first-protocol-induced tiling of the
k-bit input space. The top-left and top-right quadrants of $T_{k+1}$ are copies of
$T_k$; in each of these quadrants, a trace in $T_{k+1}$ is the corresponding trace in
$T_k$ prepended with 0. The bottom-left quadrant of $T_{k+1}$ is another copy of
$T_k$, with each trace in this part of $T_{k+1}$ being obtained by prepending 10 to
the corresponding trace in $T_k$. The bottom-right quadrant of $T_{k+1}$ is a single
rectangle whose trace is 11.

Figure 3: Partition of the value space for k = 1 (top left), 2 (bottom left), and
3 (right) induced by the 1-first protocol for Disjointness_k; each rectangle is
labeled with the transcript output by the protocol when run on inputs in the
rectangle.



Figure 4 shows the partition of the 1-, 2-, and 3-bit input spaces induced by
the alternating protocol; each induced rectangle is labeled with the corresponding
transcript (note that some rectangles appear as non-contiguous regions in
the figure). If we denote by $T_k$ the tiling of the k-bit space induced by the
alternating protocol as depicted in Fig. 4, then the bottom-left quadrant of $T_{k+1}$
has the same structure as $T_k$, with the transcript for a tile in $T_{k+1}$ obtained
by prepending 10 to the transcript for the corresponding tile in $T_k$. Each of
the top quadrants has the same structure as the reflection of $T_k$ across the
top-left-to-bottom-right diagonal; the corresponding rectangles in these quadrants
actually form single rectangles, and the associated transcript is obtained by
prepending 0 to the transcript for the corresponding rectangle in $T_k$. Finally,
the bottom-right quadrant is a single rectangle that always has the transcript
11.

4.2 Objective PAR 

4.2.1 Objective PAR for the Disjointness_k problem

Lemma 4.1. In the ideal partition induced by Disjointness_k, at least $2^k$
rectangles are required to tile the region $f^{-1}(1)$.

Figure 4: Partition of the value space for k = 1 (top left), 2 (bottom left), and
3 (right) induced by the alternating protocol for Disjointness_k; each rectangle
is labeled with the transcript output by the protocol when run on inputs in the
rectangle.



Proof. As shown in, e.g., [12], the $2^k$ input pairs $(S, \{1, \ldots, k\} \setminus S)$ form a
"fooling set": no two of these input pairs can belong to the same monochromatic
rectangle. □
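
The fooling-set property used in this proof can be verified by brute force for small k. The sketch below is illustrative (the name `is_fooling_set` and the bitmask encoding of sets are assumptions): it checks that every pair (S, complement of S) is disjoint and that, for any two distinct such pairs, at least one of the two "crossed" pairs is not disjoint, so that no monochromatic rectangle can contain both.

```python
def is_fooling_set(k):
    """Check that {(S, complement of S)} is a fooling set for disjointness on k bits."""
    full = 2 ** k - 1
    pairs = [(s, full ^ s) for s in range(2 ** k)]   # (S, {1..k} \ S) as bitmasks

    def disjoint(a, b):
        return a & b == 0

    # Every pair in the candidate set must evaluate to 1 (disjoint).
    if not all(disjoint(s, s_comp) for s, s_comp in pairs):
        return False
    # A rectangle containing (S, S') and (T, T') also contains (S, T') and (T, S');
    # it cannot be monochromatic if either crossed pair intersects.
    for index, (s, s_comp) in enumerate(pairs):
        for t, t_comp in pairs[index + 1:]:
            if disjoint(s, t_comp) and disjoint(t, s_comp):
                return False
    return True
```

The check succeeds because S is disjoint from the complement of T only when S is a subset of T, and both crossed pairs can be disjoint only when S = T.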

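Corollary 4.2. The average-case objective PAR of Disjointness_k with respect
to the uniform distribution is at least $(3/2)^k$.

Proof. By Lemma 4.1, any protocol tiles $f^{-1}(1)$ with at least $2^k$ rectangles, and
each such tile $S$ contributes $|R^I(S)| = |f^{-1}(1)| = 3^k$ to the sum in Eq. (1). The
contribution to the sum in Eq. (1) from the protocol-induced tiles
$S \subseteq f^{-1}(1)$ must therefore be at least $2^k \cdot 3^k$, so the average-case objective PAR
with respect to the uniform distribution is at least $2^k \cdot 3^k / 4^k = (3/2)^k$. □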

4.2.2 Objective PAR for specific protocols 

Lemma 4.3. If a protocol $P$ for Disjointness_k tiles $f^{-1}(1)$ with $2^k$ tiles and
tiles $f^{-1}(0)$ with $2^k - 1$ tiles, then the average-case objective PAR of $P$ with
respect to the uniform distribution equals

\[ 2^k - 1 + \left(\frac{3}{4}\right)^k. \]

Proof. By the argument for Cor. 4.2, the contribution to this PAR value from
those $S \subseteq f^{-1}(1)$ is $(3/2)^k$. The contribution to this PAR value from those
$S \subseteq f^{-1}(0)$ is $4^{-k} \cdot (4^k - 3^k) \cdot (2^k - 1)$. Summing these together, we obtain the
claimed value. □
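
For completeness, the summation in this proof can be written out as follows, using only the two contributions named above (the $f^{-1}(1)$ term is $4^{-k} \cdot 2^k \cdot 3^k = (3/2)^k$):

```latex
\begin{align*}
\left(\tfrac{3}{2}\right)^k + 4^{-k}\,(4^k - 3^k)\,(2^k - 1)
  &= \left(\tfrac{3}{2}\right)^k + \left(1 - \left(\tfrac{3}{4}\right)^k\right)(2^k - 1) \\
  &= \left(\tfrac{3}{2}\right)^k + 2^k - 1 - \left(\tfrac{3}{2}\right)^k + \left(\tfrac{3}{4}\right)^k \\
  &= 2^k - 1 + \left(\tfrac{3}{4}\right)^k .
\end{align*}
```

The second line uses $(3/4)^k \cdot 2^k = (3/2)^k$, so the two $(3/2)^k$ terms cancel.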

Proposition 4.4. The average-case objective PAR of the trivial protocol for
Disjointness_k with respect to the uniform distribution is

\[ 2^k - 1 + \left(\frac{3}{4}\right)^k. \]

Proof. The trivial protocol tiles $f^{-1}(1)$ with $2^k$ tiles (one for each set $S_1$ that
player 1 might have), and it tiles $f^{-1}(0)$ with $2^k - 1$ tiles (one for each non-empty
set $S_1$ that player 1 might have). We may then apply Lemma 4.3. □

Proposition 4.5. The average-case objective PAR of the 1-first protocol for
Disjointness_k with respect to the uniform distribution is

\[ 2^k - 1 + \left(\frac{3}{4}\right)^k. \]

Proof. The protocol-induced tiles of $f^{-1}(1)$ correspond bijectively to the $2^k$
possible protocol transcripts $\{0, 10\}^k$, while the protocol-induced tiles of $f^{-1}(0)$
correspond bijectively to the $2^k - 1$ possible protocol transcripts
$\{\{0, 10\}^i \times \{11\}\}_{i=0}^{k-1}$. We may then apply Lemma 4.3. □

Proposition 4.6. The average-case objective PAR of the alternating protocol
for Disjointness_k with respect to the uniform distribution is

\[ 2^k - 1 + \left(\frac{3}{4}\right)^k. \]

Proof. The protocol-induced tiles of $f^{-1}(1)$ correspond bijectively to the $2^k$
possible protocol transcripts $\{0, 10\}^k$, while the protocol-induced tiles of $f^{-1}(0)$
correspond bijectively to the $2^k - 1$ possible protocol transcripts
$\{\{0, 10\}^i \times \{11\}\}_{i=0}^{k-1}$. We may then apply Lemma 4.3. □
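
The tile counts used in the three proofs above, and the resulting PAR value from Lemma 4.3, can be checked by enumeration for small k. The sketch below is a sanity check under stated assumptions, not part of the paper: the compact simulators `trivial`, `one_first`, and `alternating` handle only the disjointness problem, represent inputs as bitstrings with the most significant bit first, and identify each protocol-induced tile with its transcript.

```python
from fractions import Fraction
from itertools import product


def trivial(x1, x2):
    """Player 1 sends x1; player 2 replies with the disjointness bit."""
    out = int(all(a == "0" or b == "0" for a, b in zip(x1, x2)))
    return x1 + str(out), out


def one_first(x1, x2):
    """Player 1 always speaks first; player 2 replies only after a 1."""
    t = ""
    for a, b in zip(x1, x2):
        t += a
        if a == "1":
            t += b
            if b == "1":
                return t, 0
    return t, 1


def alternating(x1, x2):
    """The leading role switches whenever the leading player announces 0."""
    bits, first, t = (x1, x2), 0, ""
    for j in range(len(x1)):
        a = bits[first][j]
        t += a
        if a == "0":
            first = 1 - first
            continue
        b = bits[1 - first][j]
        t += b
        if b == "1":
            return t, 0
    return t, 1


def check(protocol, k):
    """Return (#tiles in f^-1(1), #tiles in f^-1(0), average objective PAR)."""
    strings = ["".join(p) for p in product("01", repeat=k)]
    tiles = {protocol(x1, x2) for x1 in strings for x2 in strings}
    ones = sum(1 for _, out in tiles if out == 1)
    zeros = len(tiles) - ones
    # Ideal region sizes: 3^k disjoint pairs and 4^k - 3^k intersecting pairs.
    par = Fraction(ones * 3 ** k + zeros * (4 ** k - 3 ** k), 4 ** k)
    return ones, zeros, par
```

For each of the three protocols and each small k tried, the enumeration gives 2^k tiles with output 1, 2^k - 1 tiles with output 0, and a PAR of 2^k - 1 + (3/4)^k; for k = 2 this is 57/16.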



4.3 Subjective PAR 

4.3.1 Subjective PAR for the trivial protocol 

Proposition 4.7. The average-case PAR with respect to player 1 of the trivial
protocol for Disjointness_k is 1. The average-case PAR with respect to player
2 of the trivial protocol for Disjointness_k, and thus the average-case subjective
PAR for the protocol, is




(k -> oo). 



Proof. The 1-partition induced by the trivial protocol is exactly the ideal 1- 
partition, from which the first claim follows. 

The 2-partition induced by the trivial protocol distinguishes between every
pair of distinct inputs. To compute the average-case PAR with respect to player
2, we use $v^0_k$ and $v^1_k$ to denote the contributions (in the $k$-bit version of the
problem) to the sum in Eq. 1 from tiles in $f^{-1}(0)$ and $f^{-1}(1)$, respectively, so
the average-case PAR with respect to player 2 is then $(v^0_k + v^1_k)/4^k$.

Let $S$ be a 2-rectangle induced by the trivial protocol in the $(k+1)$-bit value
space (so $S$ is $1 \times 1$). If $S$ is in either the bottom-left or the top-left quadrant,
then the size of the ideal rectangle containing $S$ is twice the size of the ideal
rectangle that contains the corresponding induced rectangle in the $k$-bit value
space (i.e., the point in the $k$-bit space obtained by omitting the first bit of each
input in S when the value space is depicted as in Fig.Q]). This holds regardless 
of whether $S \subseteq f^{-1}(0)$ or $S \subseteq f^{-1}(1)$. If $S$ is in the top-right quadrant and
$S \subseteq f^{-1}(1)$, then the size of the ideal rectangle containing $S$ is the same as that
of the ideal rectangle containing the corresponding input in the $k$-bit value space;
note that the bottom-right quadrant does not contain any points in $f^{-1}(1)$. If $S$
is in the top-right quadrant and $S \subseteq f^{-1}(0)$, then the size of the ideal rectangle
containing $S$ is that of the ideal rectangle containing the corresponding input
in the $k$-bit value space plus $2^k$; the extra contribution of $2^k$ is added on for
each of the $4^k - 3^k$ protocol-induced 2-rectangles in the top-right quadrant. If $S$
is in the bottom-right quadrant (so that it is necessarily contained in $f^{-1}(0)$),
then the size of the ideal rectangle containing $S$ is at least $2^k$ (the part of the
containing rectangle that is in the bottom-right quadrant); the amount by which
this exceeds $2^k$ equals the size of the ideal 2-rectangle (for $f^{-1}(0)$) containing
the corresponding point in the $k$-bit value space. In particular, each of the
2-rectangles for the $k$-bit value space is counted for exactly $2^k$ induced rectangles
in the bottom-right quadrant, so the entire excess contribution is $2^k(4^k - 3^k)$.

We thus obtain the following recurrences (the terms are grouped by quad- 
rant, clockwise from the bottom left). 

$$v^0_{k+1} = 2v^0_k + 2v^0_k + \left(v^0_k + 2^k(4^k - 3^k)\right) + \left(4^k \cdot 2^k + 2^k \cdot (4^k - 3^k)\right), \qquad v^0_1 = 1$$
$$v^1_{k+1} = 2v^1_k + 2v^1_k + v^1_k + 0, \qquad v^1_1 = 5$$

From these, we obtain $v^1_k = 5^k$ and

$$v^0_k = 2^{3k} - 2^{1+k} 3^k + 5^k,$$

from which it follows that the average-case subjective PAR with respect to
player 2 (and thus for the trivial protocol) is

$$\frac{1}{4^k}\left(8^k - 2^{k+1} 3^k + 2 \cdot 5^k\right) = 2^k - 2\left(\frac{3}{2}\right)^k + 2\left(\frac{5}{4}\right)^k.$$

□ 
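
The closed forms for $v^0_k$ and $v^1_k$ can be cross-checked without the recurrences. The sketch below relies on two facts from the proof: every induced 2-rectangle of the trivial protocol is a single input, and (our reading of the definition) the ideal 2-region of an input consists of the inputs in the same column with the same function value. The function names are ours.

```python
from fractions import Fraction
from itertools import product


def trivial_player2_sums(k):
    """Return (v0, v1) for the trivial protocol on subsets of {1, ..., k}."""
    sets = list(product((0, 1), repeat=k))
    v0 = v1 = 0
    for s2 in sets:  # one column of the value space per set S_2
        n_disjoint = sum(
            1 for s1 in sets if not any(a and b for a, b in zip(s1, s2))
        )
        n_meeting = len(sets) - n_disjoint
        # Each 1 x 1 induced rectangle contributes the size of its ideal region,
        # and there are as many such rectangles as the region has points.
        v1 += n_disjoint * n_disjoint
        v0 += n_meeting * n_meeting
    return v0, v1


def par2_closed(k):
    return 2**k - 2 * Fraction(3, 2) ** k + 2 * Fraction(5, 4) ** k


if __name__ == "__main__":
    for k in range(1, 7):
        v0, v1 = trivial_player2_sums(k)
        print(k, v0 == 8**k - 2 * 6**k + 5**k, v1 == 5**k,
              Fraction(v0 + v1, 4**k) == par2_closed(k))
```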

Corollary 4.8. If $\mathrm{PAR}_i^{\mathrm{trivial}}$ denotes the average-case PAR w.r.t. $i$ of the trivial
protocol for $\textsc{Disjointness}_k$ w.r.t. the uniform distribution, then

$$\frac{\mathrm{PAR}_2^{\mathrm{trivial}}}{\mathrm{PAR}_1^{\mathrm{trivial}}} \sim 2^k \qquad (k \to \infty).$$

4.3.2 Subjective PAR for the 1-first protocol

Theorem 4.9. The average-case PAR with respect to player 1 of the 1-first
protocol for $\textsc{Disjointness}_k$ with respect to the uniform distribution is

$$\frac{k}{2} - \frac{k}{3}\left(\frac{3}{4}\right)^k + \left(\frac{3}{4}\right)^k.$$

The average-case PAR with respect to player 2 of the 1-first protocol for
$\textsc{Disjointness}_k$ with respect to the uniform distribution is

$$\left(\frac{3}{2}\right)^k + \frac{1}{2}\left(\frac{5}{4}\right)^k - 1 + \frac{1}{2}\left(\frac{3}{4}\right)^k.$$

Proof. To compute the average-case PAR with respect to player 1, we use $h^0_k$
and $h^1_k$ to denote the contributions (in the $k$-bit version of the problem) to the
sum in Eq. 1 from the 1-induced tiles in $f^{-1}(0)$ and $f^{-1}(1)$, respectively, so the
average-case PAR with respect to player 1 is then $(h^0_k + h^1_k)/4^k$.

Let $S \subseteq f^{-1}(1)$ be a 1-rectangle induced by the 1-first protocol in the
$(k+1)$-bit value space. If $S$ is in the bottom-left quadrant, the ideal 1-rectangle
containing $S$ is the same size as the ideal rectangle that contains $S$ in the
$k$-bit value space (because there are no inputs in the bottom-right quadrant in
$f^{-1}(1)$). If $S$ is in one of the top quadrants, then the ideal 1-rectangle containing
$S$ is twice the size of the rectangle containing the rectangle that corresponds
to $S$ in the partition of the $k$-bit value space. Observe that each point in the
top-left quadrant is in the same rectangle as the corresponding point in the
top-right quadrant; in particular, this means that the induced 1-rectangles in
the top two quadrants correspond bijectively to the induced 1-rectangles in the
$k$-bit value space. $S$ cannot be in the bottom-right quadrant, which contains no
points in $f^{-1}(1)$. We thus have (separating the contributions of the bottom-left,
top, and bottom-right quadrants)

$$h^1_{k+1} = h^1_k + 2h^1_k + 0 = 3h^1_k.$$

By inspection, $h^1_1 = 1 + 2 + 0 = 3$; so $h^1_k = 3^k$.

Now let $S \subseteq f^{-1}(0)$ be a 1-rectangle induced by the 1-first protocol in the
$(k+1)$-bit value space. If $S$ is in the bottom-left quadrant, then the size of the
ideal 1-rectangle containing $S$ equals the size of the ideal 1-rectangle containing
$S$ in the $k$-bit value space plus $2^k$ (because all of the inputs in the bottom-right
quadrant in the same row as $S$ are in the same ideal 1-rectangle
as $S$). If $nH^0_k$ denotes the number of induced 1-rectangles $S \subseteq f^{-1}(0)$ in the
bottom-left quadrant (this is the same as the total number of such 1-rectangles
in the $k$-bit space), then the total extra contribution is $2^k nH^0_k$. If $S$ is in the top
two quadrants, the same arguments as before apply. If $S$ is in the bottom-right
quadrant (so that the size of $S$ is $2^k$), then the ideal 1-rectangle containing $S$ has
size $2^k$ plus the size of whatever part of the ideal 1-rectangle lies in the
bottom-left quadrant. If we sum over all $2^k$ rectangles $S$ in the bottom-right quadrant,
the extra contribution from the bottom-left quadrant equals the total size of
$f^{-1}(0)$ in the $k$-bit value space, i.e., $4^k - 3^k$. This leads to (again separating
the contributions of the bottom-left, top, and bottom-right quadrants)

$$h^0_{k+1} = (h^0_k + 2^k nH^0_k) + 2h^0_k + ((4^k - 3^k) + 4^k).$$

By inspection, $h^0_1 = 1$. Because the bottom-left quadrant is a copy of the tiling
of the $k$-bit space, the top two quadrants have the same number of rectangles,
and the bottom-right quadrant has $2^k$ 1-rectangles, we have

$$nH^0_{k+1} = nH^0_k + nH^0_k + 2^k,$$

with $nH^0_1 = 1$. From this, we obtain

$$nH^0_k = k \cdot 2^{k-1},$$

which we then use to obtain

$$h^0_k = \frac{k}{6}\left(3 \cdot 4^k - 2 \cdot 3^k\right).$$

Using $\mathrm{PAR}_1$ to denote the average-case PAR with respect to 1, we have

$$\mathrm{PAR}_1 = \frac{1}{4^k}\left(h^0_k + h^1_k\right) = \frac{1}{4^k}\left(\frac{k}{6} \cdot 3 \cdot 4^k - \frac{k}{6} \cdot 2 \cdot 3^k + 3^k\right) = \frac{k}{2} - \frac{k}{3}\left(\frac{3}{4}\right)^k + \left(\frac{3}{4}\right)^k,$$

as claimed.

We now turn to the computation of the average-case PAR with respect to
player 2. We use $v^0_k$ and $v^1_k$ to denote the contributions (in the $k$-bit version
of the problem) to the sum in Eq. 1 from the 2-induced tiles in $f^{-1}(0)$ and
$f^{-1}(1)$, respectively, so the average-case PAR with respect to player 2 is then
$(v^0_k + v^1_k)/4^k$.

Let $S \subseteq f^{-1}(1)$ be a 2-rectangle induced by the 1-first protocol in the
$(k+1)$-bit value space. If $S$ is in the bottom-left quadrant, the ideal 2-rectangle
containing $S$ is twice as big as the ideal 2-rectangle that contains $S$ in the $k$-bit
value space. The same holds true if $S$ is in the top-left quadrant. If $S$ is in
the top-right quadrant, the ideal 2-rectangle containing $S$ is the same size as in
the $k$-bit value space. Finally, the bottom-right quadrant does not contain any
values in $f^{-1}(1)$. Thus, we have (again listing contributions clockwise from the
bottom-left quadrant)

$$v^1_{k+1} = 2v^1_k + 2v^1_k + v^1_k + 0 = 5v^1_k.$$

By inspection, $v^1_1 = 5$; so, $v^1_k = 5^k$.

Now let $S \subseteq f^{-1}(0)$ be a 2-rectangle induced by the 1-first protocol in the
$(k+1)$-bit value space. If $S$ is in the bottom-left or top-left quadrant, the ideal
2-rectangle containing $S$ is twice as big as in the $k$-bit value space. If $S$ is in
the top-right quadrant, the size of the ideal 2-rectangle containing $S$ equals $2^k$
plus the size of the ideal 2-rectangle that contains $S$ in the $k$-bit value space.
Finally, if we sum over all $S$ in the bottom-right quadrant, the total size of the
ideal 2-rectangles containing these $S$ is $2^k \cdot 2^k$ plus the total size of $f^{-1}(0)$ in
the $k$-bit value space. Combining all of these relations, and using $nV^0_k$ to denote
the number of 2-rectangles in $f^{-1}(0)$ in the $k$-bit value space, we have

$$v^0_{k+1} = 2v^0_k + 2v^0_k + (v^0_k + 2^k \cdot nV^0_k) + (4^k + 4^k - 3^k).$$

(As above, contributions are grouped by quadrant clockwise from the bottom
left.) We also have

$$nV^0_{k+1} = nV^0_k + nV^0_k + nV^0_k + 2^k = 3nV^0_k + 2^k.$$

By inspection, $v^0_1 = 1$ and $nV^0_1 = 1$. From this, we obtain

$$nV^0_k = 3^k - 2^k$$

and then

$$v^0_k = -4^k + \frac{1}{2} 3^k + 6^k - \frac{1}{2} 5^k.$$

Using $\mathrm{PAR}_2$ to denote the average-case PAR with respect to 2, we have

$$\mathrm{PAR}_2 = \frac{1}{4^k}\left(v^0_k + v^1_k\right) = \left(\frac{3}{2}\right)^k + \frac{1}{2}\left(\frac{5}{4}\right)^k - 1 + \frac{1}{2}\left(\frac{3}{4}\right)^k,$$

as claimed. □
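
The recurrences in this proof are easy to iterate numerically, which gives an independent check on the closed forms. The following Python sketch (the names `one_first_tables`, `par1_closed`, and `par2_closed` are ours) steps all six sequences forward from their stated initial values and compares the resulting PARs with the expressions in Theorem 4.9.

```python
from fractions import Fraction


def one_first_tables(kmax):
    """Iterate the recurrences from the proof of Theorem 4.9.

    Returns {k: (h0, h1, nH, v0, v1, nV)} for k = 1, ..., kmax.
    """
    h0, h1, nH, v0, v1, nV = 1, 3, 1, 1, 5, 1  # values at k = 1
    table = {1: (h0, h1, nH, v0, v1, nV)}
    for k in range(1, kmax):
        # The right-hand side is evaluated with the k-bit values throughout.
        h0, h1, nH, v0, v1, nV = (
            (h0 + 2**k * nH) + 2 * h0 + ((4**k - 3**k) + 4**k),
            3 * h1,
            nH + nH + 2**k,
            2 * v0 + 2 * v0 + (v0 + 2**k * nV) + (4**k + 4**k - 3**k),
            5 * v1,
            3 * nV + 2**k,
        )
        table[k + 1] = (h0, h1, nH, v0, v1, nV)
    return table


def par1_closed(k):
    r = Fraction(3, 4) ** k
    return Fraction(k, 2) - Fraction(k, 3) * r + r


def par2_closed(k):
    return (Fraction(3, 2) ** k + Fraction(5, 4) ** k / 2
            - 1 + Fraction(3, 4) ** k / 2)


if __name__ == "__main__":
    for k, (h0, h1, nH, v0, v1, nV) in one_first_tables(12).items():
        ok = (nH == k * 2 ** (k - 1) and nV == 3**k - 2**k
              and Fraction(h0 + h1, 4**k) == par1_closed(k)
              and Fraction(v0 + v1, 4**k) == par2_closed(k))
        print(k, ok)
```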

Corollary 4.10. The average-case subjective PAR of the 1-first protocol for
$\textsc{Disjointness}_k$ with respect to the uniform distribution is

$$\left(\frac{3}{2}\right)^k + \frac{1}{2}\left(\frac{5}{4}\right)^k - 1 + \frac{1}{2}\left(\frac{3}{4}\right)^k \sim \left(\frac{3}{2}\right)^k \qquad (k \to \infty).$$

Corollary 4.11. If $\mathrm{PAR}_i^{\text{1-first}}$ denotes the average-case PAR w.r.t. $i$ of the
1-first protocol for $\textsc{Disjointness}_k$ w.r.t. the uniform distribution, then

$$\frac{\mathrm{PAR}_2^{\text{1-first}}}{\mathrm{PAR}_1^{\text{1-first}}} \sim \frac{2}{k}\left(\frac{3}{2}\right)^k \qquad (k \to \infty).$$

4.3.3 Subjective PAR for the alternating protocol

We let $\mathrm{PAR}_i^{\mathrm{alt}}$ denote the PAR w.r.t. $i$ for the alternating protocol for
$\textsc{Disjointness}_k$. We let $h^1_k$ and $v^1_k$ be the contributions of $f^{-1}(1)$ to the sums analogous
to that in Eq. 1 for objective PAR, i.e.,

$$h^1_k = \sum_{S \subseteq f^{-1}(1)} |R^I_1(S)|, \qquad v^1_k = \sum_{T \subseteq f^{-1}(1)} |R^I_2(T)|,$$

where the sum for $h^1_k$ is taken over protocol-induced "horizontal" rectangles $S$
(in the induced 1-partition) on which $f$ takes the value 1, and the sum for $v^1_k$ is
taken over protocol-induced "vertical" rectangles $T$ (in the induced 2-partition)
on which $f$ takes the value 1. Using the structure of the induced tiling, we may
obtain recurrences for $h^1_k$ and $v^1_k$ as follows.

$$h^1_k = h^1_{k-1} + 2v^1_{k-1} + 0, \qquad h^1_1 = 3 \qquad (2)$$

$$v^1_k = 2v^1_{k-1} + (2h^1_{k-1} + h^1_{k-1}) + 0, \qquad v^1_1 = 5 \qquad (3)$$

In each recurrence, the first summand is the contribution from the bottom-left
quadrant, the second summand is the contribution from the two top quadrants,
and the third summand is the contribution from the bottom-right quadrant.
From these recurrences, we obtain $h^1_k = \frac{4}{5} 2^{2k} + \frac{1}{5}(-1)^k$ and $v^1_k = \frac{6}{5} 2^{2k} - \frac{1}{5}(-1)^k$.

We define $h^0_k$ and $v^0_k$ analogously to capture the contributions of $f^{-1}(0)$ to
the sums under consideration; we will also keep track of the number of tiles in
the 1- and 2-induced partitions on which $f$ takes the value 0 (the "horizontal"
and "vertical" tiles, which we denote as $nH^0_k$ and $nV^0_k$, respectively).

We start with the following recurrences for $nH^0_k$ and $nV^0_k$.

$$nH^0_{k+1} = nH^0_k + nV^0_k + 2^k, \qquad nH^0_1 = 1$$

$$nV^0_{k+1} = nV^0_k + (nH^0_k + nH^0_k) + 2^k, \qquad nV^0_1 = 1$$

From these, we obtain

$$nH^0_k = -2^{k+1} + \left(1 - \frac{3}{2\sqrt{2}}\right)(1 - \sqrt{2})^k + \frac{(1 + \sqrt{2})^k (4 + 3\sqrt{2})}{4}$$

$$nV^0_k = -3 \cdot 2^k + (1 - \sqrt{2})^k \left(\frac{3}{2} - \sqrt{2}\right) + (1 + \sqrt{2})^k \left(\frac{3}{2} + \sqrt{2}\right)$$

We obtain the following recurrences for $h^0_k$ and $v^0_k$.

$$h^0_{k+1} = (h^0_k + nH^0_k \cdot 2^k) + 2v^0_k + (4^k + 4^k - 3^k), \qquad h^0_1 = 1$$

$$v^0_{k+1} = 2v^0_k + 2h^0_k + (h^0_k + nH^0_k \cdot 2^k) + (4^k + 4^k - 3^k), \qquad v^0_1 = 1$$

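From these, we may obtain

$$h^0_k = \frac{1}{20\sqrt{2}}\Big(5 \cdot 2^{k+1} (1 - \sqrt{2})^k (-3 + 2\sqrt{2}) + 5 \cdot 2^{k+1} (1 + \sqrt{2})^k (3 + 2\sqrt{2}) + \sqrt{2}\big((-1)^k - 7 \cdot 2^{2k+3} + 5 \cdot 3^{k+1}\big)\Big)$$

and

$$v^0_k = \frac{1}{20}\Big(-(-1)^k + 25 \cdot 3^k - 21 \cdot 4^{k+1} + 5 \cdot 2^{k+1} (1 - \sqrt{2})^k (3 - 2\sqrt{2}) + 5 \cdot 2^{k+1} (1 + \sqrt{2})^k (3 + 2\sqrt{2})\Big).$$

We may now compute the PAR with respect to each of the two players as

$$\mathrm{PAR}_1^{\mathrm{alt}}(k) = \frac{h^0_k + h^1_k}{4^k} \qquad \text{and} \qquad \mathrm{PAR}_2^{\mathrm{alt}}(k) = \frac{v^0_k + v^1_k}{4^k}.$$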

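Theorem 4.12. The average-case PAR with respect to player 1 of the alternating
protocol for $\textsc{Disjointness}_k$ with respect to the uniform distribution is

$$\mathrm{PAR}_1^{\mathrm{alt}}(k) = \frac{1}{4^{k+1}}\Big((-1)^k - 2^{2k+3} + 3^{k+1} + (4 - 3\sqrt{2})(2 - 2\sqrt{2})^k + (2 + 2\sqrt{2})^k (4 + 3\sqrt{2})\Big) \sim \frac{4 + 3\sqrt{2}}{4}\left(\frac{1 + \sqrt{2}}{2}\right)^k \qquad (k \to \infty).$$

The average-case PAR with respect to player 2 of the alternating protocol for
$\textsc{Disjointness}_k$ with respect to the uniform distribution is

$$\mathrm{PAR}_2^{\mathrm{alt}}(k) = \frac{1}{4^{k+1}}\Big(-(-1)^k + 5 \cdot 3^k - 3 \cdot 4^{k+1} + 2^{k+1}(3 - 2\sqrt{2})(1 - \sqrt{2})^k + 2^{k+1}(3 + 2\sqrt{2})(1 + \sqrt{2})^k\Big) \sim \frac{3 + 2\sqrt{2}}{2}\left(\frac{1 + \sqrt{2}}{2}\right)^k \qquad (k \to \infty).$$
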

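Corollary 4.13. The average-case subjective PAR of the alternating protocol
for $\textsc{Disjointness}_k$ with respect to the uniform distribution is

$$\frac{1}{4^{k+1}}\Big(-(-1)^k + 5 \cdot 3^k - 3 \cdot 4^{k+1} + 2^{k+1}(3 - 2\sqrt{2})(1 - \sqrt{2})^k + 2^{k+1}(3 + 2\sqrt{2})(1 + \sqrt{2})^k\Big) \sim \frac{3 + 2\sqrt{2}}{2}\left(\frac{1 + \sqrt{2}}{2}\right)^k \qquad (k \to \infty).$$
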

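Corollary 4.14. If $\mathrm{PAR}_i^{\mathrm{alt}}(k)$ denotes the average-case PAR w.r.t. $i$ of the
alternating protocol for $\textsc{Disjointness}_k$ w.r.t. the uniform distribution, then

$$\frac{\mathrm{PAR}_2^{\mathrm{alt}}(k)}{\mathrm{PAR}_1^{\mathrm{alt}}(k)} \sim \sqrt{2} \qquad (k \to \infty).$$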
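
The closed forms for the alternating protocol are unwieldy, so a numerical check against the recurrences is useful. The sketch below (function names are ours) iterates the recurrences for $h^1_k$, $v^1_k$, $nH^0_k$, $nV^0_k$, $h^0_k$, and $v^0_k$ in exact integer arithmetic, compares the resulting PARs with the expressions in Theorem 4.12 in floating point, and evaluates the ratio from Corollary 4.14 for a moderately large $k$.

```python
import math

SQRT2 = math.sqrt(2)


def alternating_sums(kmax):
    """Return {k: (h0 + h1, v0 + v1)}, the numerators of PAR_1 and PAR_2."""
    h1, v1, nH, nV, h0, v0 = 3, 5, 1, 1, 1, 1  # values at k = 1
    sums = {1: (h0 + h1, v0 + v1)}
    for k in range(1, kmax):
        # All right-hand sides use the k-bit values.
        h1, v1, nH, nV, h0, v0 = (
            h1 + 2 * v1,
            2 * v1 + (2 * h1 + h1),
            nH + nV + 2**k,
            nV + (nH + nH) + 2**k,
            (h0 + nH * 2**k) + 2 * v0 + (4**k + 4**k - 3**k),
            2 * v0 + 2 * h0 + (h0 + nH * 2**k) + (4**k + 4**k - 3**k),
        )
        sums[k + 1] = (h0 + h1, v0 + v1)
    return sums


def par1_closed(k):
    total = ((-1) ** k - 2 ** (2 * k + 3) + 3 ** (k + 1)
             + (4 - 3 * SQRT2) * (2 - 2 * SQRT2) ** k
             + (2 + 2 * SQRT2) ** k * (4 + 3 * SQRT2))
    return total / 4 ** (k + 1)


def par2_closed(k):
    total = (-((-1) ** k) + 5 * 3**k - 3 * 4 ** (k + 1)
             + 2 ** (k + 1) * (3 - 2 * SQRT2) * (1 - SQRT2) ** k
             + 2 ** (k + 1) * (3 + 2 * SQRT2) * (1 + SQRT2) ** k)
    return total / 4 ** (k + 1)


if __name__ == "__main__":
    sums = alternating_sums(60)
    for k in range(1, 11):
        n1, n2 = sums[k]
        print(k, math.isclose(n1 / 4**k, par1_closed(k), rel_tol=1e-9),
              math.isclose(n2 / 4**k, par2_closed(k), rel_tol=1e-9))
    print("ratio at k = 60:", sums[60][1] / sums[60][0], "sqrt(2):", SQRT2)
```

The ratio is computed from the integer numerators, since the common factor $4^{-k}$ cancels.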

5 PARs for $\textsc{Intersection}_k$



5.1 Structure of Protocol-Induced Tilings 

First, we observe that for $\textsc{Intersection}_k$, the trivial and 1-first protocols
induce the same tiling.

Lemma 5.1. The tilings induced by the trivial and 1-first protocols for
$\textsc{Intersection}_k$ are identical.

Proof. Given two input pairs $(S_1, S_2)$ and $(T_1, T_2)$, each of these protocols
cannot distinguish between the pairs if and only if (1) $S_1 = T_1$ and (2) $S_2$ and $T_2$
differ only on elements that are not in $S_1 = T_1$. □
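
Lemma 5.1 can be confirmed by enumeration for small $k$. The sketch below assumes concrete forms of the two protocols that are consistent with the proof but not spelled out in this section: in the trivial protocol, player 1 sends $S_1$ and player 2 then announces the intersection; in the 1-first protocol, player 1 announces each element's membership in turn and player 2 answers only when player 1 holds the element. All function names are ours.

```python
from itertools import product


def intersection(s1, s2):
    return tuple(a & b for a, b in zip(s1, s2))


def trivial_transcript(s1, s2):
    # Assumed: player 1 reveals S_1, player 2 replies with the intersection.
    return (s1, intersection(s1, s2))


def one_first_transcript(s1, s2):
    # Assumed: "0" if player 1 lacks the element, otherwise "1" plus player 2's bit.
    return tuple("0" if a == 0 else "1" + str(b) for a, b in zip(s1, s2))


def induced_partition(transcript, k):
    """Group all input pairs by transcript and return the set of tiles."""
    sets = list(product((0, 1), repeat=k))
    tiles = {}
    for s1 in sets:
        for s2 in sets:
            tiles.setdefault(transcript(s1, s2), set()).add((s1, s2))
    return {frozenset(tile) for tile in tiles.values()}


def same_tiling(k):
    return (induced_partition(trivial_transcript, k)
            == induced_partition(one_first_transcript, k))


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, same_tiling(k))
```

Comparing sets of tiles, rather than transcripts, reflects the fact that the two protocols label the same tiles differently.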

Figure 5 depicts the tilings of the 1-, 2-, and 3-bit value spaces induced by
the trivial and 1-first protocols for $\textsc{Intersection}_k$. If we denote by $T_k$ the
1-first-protocol-induced tiling of the $k$-bit input space, then when we depict
$T_{k+1}$ as in Fig. 5 the bottom-left quadrant is $10T_k$ (i.e., the $k$-bit tiling with
10 prepended to each transcript), each of the top quadrants is $0T_k$, and the
bottom-right quadrant is $11T_k$.

Figure 5: Partition of the value space for $k = 1$ (top left), 2 (bottom left), and
3 (right) induced by the trivial and 1-first protocols for $\textsc{Intersection}_k$; each
rectangle is labeled with the transcript output by the protocol when run on
inputs in the rectangle.



Figure 6 depicts the tilings of the 1-, 2-, and 3-bit value spaces induced by the
alternating protocol for $\textsc{Intersection}_k$. If we denote by $T_k$ the
alternating-protocol-induced tiling of the $k$-bit value space and depict $T_{k+1}$ as in Fig. 6,
the bottom-left quadrant is $10T_k$ (i.e., the $k$-bit tiling with 10 prepended to
each transcript), each of the top quadrants is $0T_k^T$ (i.e., the $k$-bit tiling reflected
across the diagonal from top left to bottom right), and the bottom-right quadrant is
$11T_k$.

Figure 6: Partition of the value space for $k = 1$ (top left), 2 (bottom left), and 3
(right) induced by the alternating protocol for $\textsc{Intersection}_k$; each rectangle
is labeled with the transcript output by the protocol when run on inputs in the
rectangle.



5.2 Objective PAR 
5.2.1 Lower bound 

We obtain the following result for the average-case objective PAR of the
$\textsc{Intersection}_k$ problem.

Theorem 5.2. The average-case objective PAR of the $\textsc{Intersection}_k$ problem
with respect to the uniform distribution is $\left(\frac{7}{4}\right)^k$.

Proof. We show that $\mathrm{PAR}_{k+1} = \frac{7}{4}\mathrm{PAR}_k$ and that $\mathrm{PAR}_1 = \frac{7}{4}$.
Using Eq. 1, we may write $\mathrm{PAR}_{k+1}$ as

$$\mathrm{PAR}_{k+1} = \frac{1}{4^{k+1}}\left(\sum_{R \subseteq f^{-1}(0\ldots)} |R^I(R)| + \sum_{R \subseteq f^{-1}(1\ldots)} |R^I(R)|\right) \qquad (4)$$

where the first sum is over induced rectangles $R$ in which the intersection set does
not contain $k+1$ (i.e., the encoding of the set starts with 0) and the second sum is
over induced rectangles $R$ in which the intersection set does contain this element.
Observe that the ideal monochromatic partition of the region corresponding to
inputs in which $k + 1 \in S_1 \cap S_2$ (the bottom-right quadrant when depicted as in
Fig. 2) has the same structure as the ideal monochromatic partition of the entire
space when only $k$ elements are used. Similarly, the three regions corresponding
to $k + 1 \notin S_1 \cup S_2$ (top-left quadrant), $k + 1 \in S_1 \setminus S_2$ (bottom-left quadrant),
and $k + 1 \in S_2 \setminus S_1$ (top-right quadrant) all have this same structure, although
each input in these regions belongs to the same monochromatic region as the
corresponding inputs in the other two quadrants.
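The first observation allows us to rewrite Eq. 4 as

$$\mathrm{PAR}_{k+1} = \frac{1}{4^{k+1}}\left(\sum_{R \subseteq f^{-1}(0\ldots)} |R^I(R)|\right) + \frac{1}{4}\mathrm{PAR}_k. \qquad (5)$$

We now turn to rewriting the term in parentheses.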

Consider an input $(0x_1, 0x_2) \in f^{-1}(0x)$ (i.e., $x, x_i \in \{0,1\}^k$ and $x_1 \cap x_2 = x$)
in the top-left quadrant of the $(k+1)$-bit input space (when depicted as in Fig. 2).
In any monochromatic tiling of this space, $(0x_1, 0x_2)$ may be in the same tile
as at most one of the inputs $(0x_1, 1x_2)$ (top-right quadrant) and $(1x_1, 0x_2)$
(bottom-left quadrant): if both $(0x_1, 1x_2)$ and $(1x_1, 0x_2)$ were in the same tile,
then $(1x_1, 1x_2) \in f^{-1}(1x)$ would also be in this tile, violating monochromaticity.
If $a_x$ is the minimum number of monochromatic tiles needed to tile the region
$f^{-1}(x)$ in the $k$-bit input space, then at least $2a_x$ monochromatic tiles are needed
to tile the region $f^{-1}(0x)$ in the $(k+1)$-bit input space. For any $x \in \{0,1\}^k$,
the size of the ideal monochromatic region $f^{-1}(0x)$ is 3 times the size of the
monochromatic region $f^{-1}(x)$ in the ideal partition of the input space for
$k$-element sets. Thus the contribution to the sum (for $\mathrm{PAR}_{k+1}$) in Eq. 4 of the
rectangles $R$ in $f^{-1}(0x)$ is 6 times the contribution to the
sum (for $\mathrm{PAR}_k$) of the rectangles $R$ in $f^{-1}(x)$. This allows us to rewrite Eq. 5
as

$$\mathrm{PAR}_{k+1} = \frac{6}{4}\mathrm{PAR}_k + \frac{1}{4}\mathrm{PAR}_k.$$

Finally, the ideal partition for the $\textsc{Intersection}_k$ problem with $k = 1$, shown
in Fig. 7, requires at least 2 tiles for the region (of size 3) corresponding to an
empty intersection and a single tile for the region (of size 1) corresponding to a
non-empty intersection. This immediately gives the initial condition

$$\mathrm{PAR}_1 = \frac{1}{4}(3 + 3 + 1) = \frac{7}{4}.$$

□

Figure 7: Ideal partition for the $\textsc{Intersection}_k$ problem with $k = 1$.

5.2.2 Objective PAR for the trivial and 1-first protocols

Proposition 5.3. The average-case objective PAR for the trivial and 1-first
protocols for the $\textsc{Intersection}_k$ problem equals $\left(\frac{7}{4}\right)^k$.

Proof. Consider the tiling $T_{k+1}$ of the $(k+1)$-bit value space induced by these
protocols. Any tile $S$ in $T_k$ has 3 corresponding tiles in $T_{k+1}$: the tile whose
transcript (in the 1-first protocol) is $10S$, in the bottom-left quadrant; the tile
whose transcript is $0S$, which spans the top two quadrants; and the tile whose
transcript is $11S$, which is in the bottom-right quadrant. The ideal monochromatic
region that contains $0S$ and $10S$ (the same region contains both) in the
$(k+1)$-bit value space is 3 times the size of the ideal monochromatic region
that contains $S$ in the $k$-bit value space; the ideal monochromatic region that
contains $11S$ is the same size as the ideal monochromatic region that contains
$S$. Thus, we have that $\mathrm{PAR}_{k+1} = \frac{7}{4}\mathrm{PAR}_k$. By inspection, $\mathrm{PAR}_1 = \frac{7}{4}$, finishing
the proof. □
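
The value $(7/4)^k$ can be reproduced by direct enumeration. The sketch below assumes the same concrete 1-first protocol for Intersection as suggested by the transcripts $10S$, $0S$, and $11S$ in the proof (one token per element: 0, 10, or 11) and takes the ideal region of an input to be the full preimage of its intersection set. The function names are ours.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def intersection(s1, s2):
    return tuple(a & b for a, b in zip(s1, s2))


def one_first_transcript(s1, s2):
    # Assumed: "0" if player 1 lacks the element, otherwise "1" plus player 2's bit.
    return tuple("0" if a == 0 else "1" + str(b) for a, b in zip(s1, s2))


def objective_par(k):
    """Average over all inputs of |ideal region| / |induced tile|."""
    sets = list(product((0, 1), repeat=k))
    inputs = [(s1, s2) for s1 in sets for s2 in sets]
    ideal = Counter(intersection(*x) for x in inputs)
    tile = Counter(one_first_transcript(*x) for x in inputs)
    total = sum(
        Fraction(ideal[intersection(*x)], tile[one_first_transcript(*x)])
        for x in inputs
    )
    return total / 4**k


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, objective_par(k), objective_par(k) == Fraction(7, 4) ** k)
```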



5.2.3 Objective PAR for the alternating protocol 

Although the recursive tiling structure induced by the alternating protocol is
slightly different from that induced by the trivial and 1-first protocols, the
argument from the proof of Prop. 5.3 applies essentially unchanged. In particular,
even though the structure is different, the tiles in $T_{k+1}$ corresponding to a tile
$S$ in $T_k$ are: one tile in the bottom-left quadrant; one tile that spans the top
two quadrants; and one tile in the bottom-right quadrant. Thus, we again have
$\mathrm{PAR}_{k+1} = \frac{7}{4}\mathrm{PAR}_k$. Again, we also have $\mathrm{PAR}_1 = \frac{7}{4}$, giving us the following
proposition.

Proposition 5.4. The average-case objective PAR for the alternating protocol
for the $\textsc{Intersection}_k$ problem equals $\left(\frac{7}{4}\right)^k$. □

5.3 Subjective PAR 

5.3.1 Subjective PAR for the trivial and 1-first protocols 

Remark 5.5. The contribution from $f^{-1}(0)$ is as for $\textsc{Disjointness}_k$. What
about the contribution for $f^{-1}(\neq 0)$?

Proposition 5.6. The average-case PAR with respect to player 1 of the trivial
and 1-first protocols for $\textsc{Intersection}_k$ is 1. The average-case PAR with
respect to player 2 of the trivial and 1-first protocols for $\textsc{Intersection}_k$ is $\left(\frac{3}{2}\right)^k$.

Proof. The 1-partition induced by the trivial protocol is exactly the ideal 1- 
partition, from which the first claim follows. 

For the second claim, we let $v_k$ be the value of the sum in Eq. 1. Let $S$
be a tile in the induced 2-tiling of the $k$-bit input space; we will also use $S$ to
denote the 1-first-protocol transcript that labels $S$. We now consider the tiles
corresponding to $S$ in the induced 2-tiling of the $(k+1)$-bit input space. The
tile $10S$ in the bottom-left quadrant is contained in an ideal region that is twice
as big as the one that contains $S$ (this ideal region contains points in both the
bottom-left and top-left quadrants); the same is true of the tile $0S$ in the top-left
quadrant. The tile $0S$ in the top-right quadrant (which is a different 2-induced
tile than the one in the top-left quadrant) is contained in an ideal region that
is the same size as the ideal region containing $S$ (this ideal region does not
contain any points in the bottom-right quadrant). Finally, the tile $11S$ in the
bottom-right quadrant is contained in an ideal region that is the same size as
the ideal region containing $S$. Thus, we have that $v_{k+1} = 6v_k$; by inspection,
$v_1 = 6$, so $v_k = 6^k$. Note that the average-case PAR with respect to 2 equals
$v_k/4^k$, completing the proof. □
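
Both claims of Proposition 5.6 can be checked by enumeration. The sketch below uses our reading of the subjective notions: player $i$ knows its own input, so the induced and ideal $i$-regions of an input are computed among inputs that share player $i$'s set. The concrete 1-first transcript (tokens 0, 10, 11, one per element) is an assumption consistent with the proof, and the function names are ours.

```python
from fractions import Fraction
from itertools import product


def intersection(s1, s2):
    return tuple(a & b for a, b in zip(s1, s2))


def one_first_transcript(s1, s2):
    # Assumed: "0" if player 1 lacks the element, otherwise "1" plus player 2's bit.
    return tuple("0" if a == 0 else "1" + str(b) for a, b in zip(s1, s2))


def par_wrt_player(k, player):
    """Average-case PAR w.r.t. `player` (1 or 2) under the uniform distribution."""
    sets = list(product((0, 1), repeat=k))
    total = Fraction(0)
    for s1 in sets:
        for s2 in sets:
            # Inputs that the player cannot rule out from its own set alone.
            if player == 1:
                candidates = [(s1, t) for t in sets]  # same row
            else:
                candidates = [(t, s2) for t in sets]  # same column
            value = intersection(s1, s2)
            script = one_first_transcript(s1, s2)
            ideal = sum(1 for c in candidates if intersection(*c) == value)
            induced = sum(
                1 for c in candidates if one_first_transcript(*c) == script
            )
            total += Fraction(ideal, induced)
    return total / 4**k


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, par_wrt_player(k, 1), par_wrt_player(k, 2))
```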

Corollary 5.7. The average-case subjective PAR of the trivial and 1-first
protocols for $\textsc{Intersection}_k$ with respect to the uniform distribution is

$$\left(\frac{3}{2}\right)^k.$$

Corollary 5.8. If $\mathrm{PAR}_i^{\mathrm{trivial}}$ denotes the average-case PAR w.r.t. $i$ of the trivial
protocol for $\textsc{Intersection}_k$ w.r.t. the uniform distribution, and if $\mathrm{PAR}_i^{\text{1-first}}$
denotes the average-case PAR w.r.t. $i$ of the 1-first protocol for $\textsc{Intersection}_k$
w.r.t. the uniform distribution, then

$$\frac{\mathrm{PAR}_2^{\mathrm{trivial}}}{\mathrm{PAR}_1^{\mathrm{trivial}}} = \frac{\mathrm{PAR}_2^{\text{1-first}}}{\mathrm{PAR}_1^{\text{1-first}}} = \left(\frac{3}{2}\right)^k.$$

5.3.2 Subjective PAR for the alternating protocol 

Proposition 5.9. The average-case PAR with respect to player 1 of the
alternating protocol for $\textsc{Intersection}_k$ is $\frac{4}{5}\left(\frac{5}{4}\right)^k$. The average-case PAR with
respect to player 2 of the alternating protocol for $\textsc{Intersection}_k$ is $\frac{6}{5}\left(\frac{5}{4}\right)^k$.

Proof. We let

$$h_k = \sum_{S} |R^I_1(S)|,$$

where the sum is taken over all induced 1-rectangles ("horizontal rectangles")
in the $k$-bit value space, and we let

$$v_k = \sum_{S} |R^I_2(S)|,$$

where the sum is taken over all induced 2-rectangles ("vertical rectangles") in
the $k$-bit value space.

Making use of the structure of the tiling, we have that

$$v_{k+1} = 2v_k + 2h_k + h_k + v_k = 3(v_k + h_k),$$

where the summands correspond to the contributions from each quadrant
(clockwise from the bottom-left quadrant). We also have

$$h_{k+1} = h_k + 2v_k + h_k = 2(v_k + h_k),$$

where the summands correspond to the contributions from the bottom-left,
top-two, and bottom-right quadrants, respectively. By inspection, we have $h_1 = 4$
and $v_1 = 6$; this gives $h_k = 4 \cdot 5^{k-1}$ and $v_k = 6 \cdot 5^{k-1}$. Denoting by $\mathrm{PAR}_i^{\mathrm{alt}}(k)$
the average-case PAR w.r.t. $i$ of the alternating protocol for $\textsc{Intersection}_k$ w.r.t.
the uniform distribution, we have

$$\mathrm{PAR}_1^{\mathrm{alt}}(k) = \frac{h_k}{4^k} = \frac{4}{5}\left(\frac{5}{4}\right)^k, \qquad \mathrm{PAR}_2^{\mathrm{alt}}(k) = \frac{v_k}{4^k} = \frac{6}{5}\left(\frac{5}{4}\right)^k,$$

as claimed. □
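
The coupled recurrences in this proof can be iterated directly. The sketch below (function name ours) starts from $h_1 = 4$ and $v_1 = 6$, applies the two recurrences, and compares the results with the closed forms and with the PAR values in Proposition 5.9.

```python
from fractions import Fraction


def alternating_intersection_sums(kmax):
    """Return {k: (h_k, v_k)} for k = 1, ..., kmax."""
    h, v = 4, 6
    sums = {1: (h, v)}
    for k in range(1, kmax):
        # Both updates use the k-bit values of h and v.
        h, v = h + 2 * v + h, 2 * v + 2 * h + h + v
        sums[k + 1] = (h, v)
    return sums


if __name__ == "__main__":
    for k, (h, v) in alternating_intersection_sums(12).items():
        ok = (h == 4 * 5 ** (k - 1) and v == 6 * 5 ** (k - 1)
              and Fraction(h, 4**k) == Fraction(4, 5) * Fraction(5, 4) ** k
              and Fraction(v, 4**k) == Fraction(6, 5) * Fraction(5, 4) ** k)
        print(k, h, v, ok)
```

Since $h_{k+1} + v_{k+1} = 5(h_k + v_k)$, the ratio $v_k / h_k$ stays fixed at $3/2$ for every $k$.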

Corollary 5.10. The average-case subjective PAR of the alternating protocol
for $\textsc{Intersection}_k$ is

$$\frac{6}{5}\left(\frac{5}{4}\right)^k.$$

Corollary 5.11. If $\mathrm{PAR}_i^{\mathrm{alt}}$ denotes the average-case PAR w.r.t. $i$ of the alternating
protocol for $\textsc{Intersection}_k$ w.r.t. the uniform distribution, then

$$\frac{\mathrm{PAR}_2^{\mathrm{alt}}}{\mathrm{PAR}_1^{\mathrm{alt}}} = \frac{3}{2}.$$




6 Conclusions and Future Work 

Our definitions of PARs involve the intuitive notion of the indistinguishability 
of inputs that is natural to consider in the context of privacy preservation. 
Other definitions of PARs may be appropriate in analyzing other notions of 
privacy. For example, if there is a natural notion of "distance" between inputs 
(as in the examples considered in this paper), one might prefer protocols that 
cannot distinguish among a few inputs that are far from each other to protocols 
that cannot distinguish among many inputs that are all relatively close. This 
necessitates different definitions of PARs and suggests many interesting avenues 
for future work. 

Starting from the same place that we did, namely [4, 11], Bar-Yehuda et
al. [1] provided three definitions of approximate privacy. We show in [5] that
the formulation in [1] is not equivalent to ours, but there is more to do along
these lines. The definition in [1] that seems most relevant to the study of
privacy-approximation ratios is their notion of h-privacy. Determine when and how it
is possible to express PARs in terms of h-privacy and vice versa.

Lower bounds on the average-case subjective PARs for $\textsc{Disjointness}_k$ and
$\textsc{Intersection}_k$ would be interesting; as noted above, we conjecture that these
are exponential in $k$. Our PAR framework should also be applied to other
functions and extended to n-party communication.

A Perfect Privacy and Communication Complexity

For convenience, we include Sec. 2 of (a revised version of) [5] as the text of this 
appendix. It contains the basic definitions of communication complexity and 
privacy that underlie our approach to approximate privacy. 

A.1 Two-Party Communication Model

We now briefly review Yao's model of two-party communication and notions 
of objective and subjective perfect privacy; see Kushilevitz and Nisan for 
a comprehensive overview of communication complexity theory. Note that we 
only deal with deterministic communication protocols. Our definitions can be 
extended to randomized protocols. 

There are two parties, 1 and 2, each holding a $k$-bit input string. The
input of party $i$, $x_i \in \{0,1\}^k$, is the private information of $i$. The parties
communicate with each other in order to compute the value of a function
$f : \{0,1\}^k \times \{0,1\}^k \to \{0,1\}^t$. The two parties alternately send messages to each
other. In communication round $j$, one of the parties sends a bit $q_j$ that is a
function of that party's input and the history $(q_1, \ldots, q_{j-1})$ of previously sent
messages. We say that a bit is meaningful if it is not a constant function of
this input and history and if, for every meaningful bit transmitted previously,
there is some combination of input and history for which the bit differs from
the earlier meaningful bit. Non-meaningful bits (e.g., those sent as part of
protocol-message headers) are irrelevant to our work here and will be ignored.
A communication protocol dictates, for each party, when it is that party's turn to
transmit a message and what message he should transmit, based on the history
of messages and his value.

A communication protocol $P$ is said to compute $f$ if, for every pair of inputs
$(x_1, x_2)$, it holds that $P(x_1, x_2) = f(x_1, x_2)$. As in [11], the last message sent in
a protocol $P$ is assumed to contain the value $f(x_1, x_2)$ and therefore may require
up to $t$ bits. The communication complexity of a protocol $P$ is the maximum,
over all input pairs, of the number of bits transmitted during the execution of
$P$.

Any function $f : \{0,1\}^k \times \{0,1\}^k \to \{0,1\}^t$ can be visualized as a $2^k \times 2^k$
matrix with entries in $\{0,1\}^t$, in which the rows represent the possible inputs
of party 1, the columns represent the possible inputs of party 2, and each entry
contains the value of $f$ associated with its row and column inputs. This matrix
is denoted by $A(f)$.

Definition A.1 (Regions, partitions). A region in a matrix $A$ is any subset of
entries in $A$ (not necessarily a submatrix). A partition of $A$ is a collection of
disjoint regions in $A$ whose union equals $A$.

Definition A.2 (Monochromaticity). A region $R$ in a matrix $A$ is called
monochromatic if all entries in $R$ contain the same value. A monochromatic partition of
$A$ is a partition all of whose regions are monochromatic.

Of special interest in communication complexity are specific kinds of regions
and partitions called rectangles and tilings, respectively:

Definition A.3 (Rectangles, Tilings). A rectangle in a matrix $A$ is a submatrix
of $A$. A tiling of a matrix $A$ is a partition of $A$ into rectangles.

Definition A.4 (Refinements). A tiling $T_1(f)$ of a matrix $A(f)$ is said to be a
refinement of another tiling $T_2(f)$ of $A(f)$ if every rectangle in $T_1(f)$ is contained
in some rectangle in $T_2(f)$.

Monochromatic rectangles and tilings are an important concept in
communication-complexity theory, because they are linked to the execution of
communication protocols. Every communication protocol $P$ for a function $f$ can be
thought of as follows:

1. Let $R$ and $C$ be the sets of row and column indices of $A(f)$, respectively.
For $R' \subseteq R$ and $C' \subseteq C$, we will abuse notation and write $R' \times C'$ to
denote the submatrix of $A(f)$ obtained by deleting the rows not in $R'$ and
the columns not in $C'$.

2. While $R \times C$ is not monochromatic:

• One party $i \in \{1, 2\}$ sends a single bit $q$ (whose value is based on $x_i$
and the history of communication).

• If $i = 1$, $q$ indicates whether 1's value is in one of two disjoint sets
$R_1, R_2$ whose union equals $R$. If $x_1 \in R_1$, both parties set $R = R_1$.
If $x_1 \in R_2$, both parties set $R = R_2$.

• If $i = 2$, $q$ indicates whether 2's value is in one of two disjoint sets
$C_1, C_2$ whose union equals $C$. If $x_2 \in C_1$, both parties set $C = C_1$.
If $x_2 \in C_2$, both parties set $C = C_2$.

3. One of the parties sends a last message (consisting of up to $t$ bits)
containing the value in all entries of the monochromatic rectangle $R \times C$.
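
The three steps above can be sketched as a short simulation. This is only an illustration under stated assumptions: `run_protocol`, `choose_split`, `f_and`, and `split_and` are hypothetical names of ours, and the example function (AND of two single bits) with its schedule (party 1 speaks first) is chosen for brevity rather than taken from the text.

```python
def run_protocol(f, rows, cols, x1, x2, choose_split):
    """Shrink R x C as in steps 1-3; return (bits sent, value, final rectangle)."""
    R, C = set(rows), set(cols)                       # step 1
    bits = []
    while len({f(a, b) for a in R for b in C}) > 1:   # step 2: not monochromatic
        party, part = choose_split(R, C)  # `part` plays the role of R_1 or C_1
        if party == 1:
            bit = int(x1 in part)
            R = R & part if bit else R - part
        else:
            bit = int(x2 in part)
            C = C & part if bit else C - part
        bits.append(bit)
    value = f(x1, x2)                     # step 3: the last message carries this
    return bits, value, (frozenset(R), frozenset(C))


def f_and(x1, x2):
    return x1 & x2


def split_and(R, C):
    """Hypothetical schedule for AND: party 1 splits R first, then party 2 splits C."""
    return (1, {1}) if len(R) > 1 else (2, {1})


if __name__ == "__main__":
    for x1 in (0, 1):
        for x2 in (0, 1):
            print((x1, x2), run_protocol(f_and, [0, 1], [0, 1], x1, x2, split_and))
```

The schedule passed as `choose_split` must always return a subset that properly splits the current $R$ or $C$; otherwise the loop would not make progress.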

Observe that, for every pair of private inputs $(x_1, x_2)$, $P$ terminates at some
monochromatic rectangle in $A(f)$ that contains $(x_1, x_2)$. We refer to this
rectangle as "the monochromatic rectangle induced by $P$ for $(x_1, x_2)$". We refer to
the tiling that consists of all rectangles induced by $P$ (for all pairs of inputs) as
"the monochromatic tiling induced by $P$".



Figure 8: A tiling that is not induced by any communication protocol [11]

Remark A.5. There are monochromatic tilings that cannot be induced by
communication protocols. For example, observe that the tiling in Fig. 8 (which is
essentially an example from [11]) has this property.

A.2 Perfect Privacy

Informally, we say that a two-party protocol is perfectly privacy-preserving if 
the two parties (or a third party observing the communication between them) 
cannot learn more from the execution of the protocol than the value of the 
function the protocol computes. (This definition can be extended naturally to 
protocols involving more than two participants.) 

Formally, let $P$ be a communication protocol for a function $f$. The
communication string passed in $P$ is the concatenation of all the messages $(q_1, q_2, \ldots)$
sent in the course of the execution of $P$. Let $s_{(x_1,x_2)}$ denote the communication
string passed in $P$ if the inputs of the parties are $(x_1, x_2)$. We are now ready
to define perfect privacy. The following two definitions handle privacy from the
point of view of a party $i$ that does not want the other party (that is, of course,
familiar not only with the communication string, but also with his own value) to
learn more than necessary about $i$'s private information. We say that a protocol
is perfectly private with respect to party 1 if 1 never learns more about party
2's private information than necessary to compute the outcome.
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Definition A. 6 (Perfect privacy with respect to 1). [UTTTj P is perfectly private 
with respect to party 1 if, for every x%, x' 2 such that f(xi, x%) — f(xi,x' 2 ), it holds 

that S(a;i,X2) = s (xi,x 2 )- 

Informally, Def. A.6 says that party 1's knowledge of the communication
string passed in the protocol and his knowledge of $x_1$ do not aid him in
distinguishing between two possible inputs of 2. Similarly:

Definition A.7 (Perfect privacy with respect to 2). [4, 11] P is perfectly private
with respect to party 2 if, for every $x_1, x_1'$ such that $f(x_1, x_2) = f(x_1', x_2)$,
it holds that $s_{(x_1,x_2)} = s_{(x_1',x_2)}$.

Observation A.8. For any function f, the protocol in which party i reveals $x_i$
and the other party computes the outcome of the function is perfectly private
with respect to i.

Definition A.9 (Perfect subjective privacy). P achieves perfect subjective
privacy if it is perfectly private with respect to both parties.

The following definition considers a different form of privacy — privacy from 
a third party that observes the communication string but has no a priori knowl- 
edge about the private information of the two communicating parties. We refer 
to this notion as "objective privacy".

Definition A.10 (Perfect objective privacy). P achieves perfect objective
privacy if, for every two pairs of inputs $(x_1, x_2)$ and $(x_1', x_2')$ such that
$f(x_1, x_2) = f(x_1', x_2')$, it holds that $s_{(x_1,x_2)} = s_{(x_1',x_2')}$.
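
On small input spaces, Defs. A.6, A.7, and A.10 can be checked by brute force. The following is a minimal sketch under stated assumptions: the function `f` (AND of two bits), the toy protocol `transcript`, and all helper names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical toy instance: each party holds one bit and f is AND.
X1 = X2 = [0, 1]

def f(x1, x2):
    return x1 & x2

def transcript(x1, x2):
    """Toy protocol: party 1 sends x1; if x1 == 1, party 2 replies with x2."""
    return (x1,) if x1 == 0 else (x1, x2)

def private_wrt_1(f, s):
    # Def. A.6: for fixed x1, inputs of party 2 with equal outcomes
    # must produce equal communication strings.
    return all(s(x1, a) == s(x1, b)
               for x1 in X1 for a, b in product(X2, repeat=2)
               if f(x1, a) == f(x1, b))

def private_wrt_2(f, s):
    # Def. A.7: for fixed x2, inputs of party 1 with equal outcomes
    # must produce equal communication strings.
    return all(s(a, x2) == s(b, x2)
               for x2 in X2 for a, b in product(X1, repeat=2)
               if f(a, x2) == f(b, x2))

def objectively_private(f, s):
    # Def. A.10: any two input pairs with equal outcomes
    # must produce equal communication strings.
    pairs = list(product(X1, X2))
    return all(s(*p) == s(*q) for p, q in product(pairs, repeat=2)
               if f(*p) == f(*q))
```

For this toy protocol, the check for party 1 succeeds, while the check for party 2 and the objective check fail: when $x_2 = 0$ the outcome is 0 regardless of $x_1$, yet the communication string reveals $x_1$.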

Kushilevitz [11] was the first to point out the interesting connections between
perfect privacy and communication-complexity theory. Intuitively, we can think 
of any monochromatic rectangle R in the tiling induced by a protocol P as a 
set of inputs that are indistinguishable to a third party. This is because, by 
definition of R, for any two pairs of inputs in R, the communication string 
passed in P must be the same. Hence we can think of the privacy of the 
protocol in terms of the tiling induced by that protocol. 

Ideally, every two pairs of inputs that are assigned the same outcome by
a function f will belong to the same monochromatic rectangle in the tiling
induced by a protocol for f. This observation enables a simple characterization
of perfectly privacy-preserving mechanisms.

Definition A.11 (Ideal monochromatic partitions). A monochromatic region
in a matrix A is said to be a maximal monochromatic region if no monochromatic 
region in A properly contains it. The ideal monochromatic partition of A is made 
up of the maximal monochromatic regions. 

Observation A.12. For every possible value in a matrix A, the maximal
monochromatic region that corresponds to this value is unique. This implies the
uniqueness of the ideal monochromatic partition for A.

Observation A.13 (A characterization of perfectly privacy-preserving
protocols). A communication protocol P for f is perfectly privacy-preserving iff
the monochromatic tiling induced by P is the ideal monochromatic partition of
A(f). This holds for all of the above notions of privacy.
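
The characterization can be illustrated for objective privacy by comparing two partitions of the input space directly. This is a sketch only: it assumes a deterministic protocol whose induced tiles are exactly the sets of inputs that share a communication string, and the example function and both protocols are hypothetical.

```python
from collections import defaultdict
from itertools import product

def partition_by(key, X1, X2):
    """Group all input pairs by key(x1, x2); return a set of frozensets."""
    groups = defaultdict(set)
    for x1, x2 in product(X1, X2):
        groups[key(x1, x2)].add((x1, x2))
    return {frozenset(g) for g in groups.values()}

def is_objectively_private(f, s, X1, X2):
    ideal = partition_by(f, X1, X2)      # one maximal region per value of f
    induced = partition_by(s, X1, X2)    # one tile per communication string
    return induced == ideal

# Hypothetical example: f returns party 1's input, so each column is a region.
X = [0, 1, 2]
f = lambda x1, x2: x1
send_x1 = lambda x1, x2: (x1,)           # party 1 announces x1
send_both = lambda x1, x2: (x1, x2)      # both parties announce their inputs
```

Here `is_objectively_private(f, send_x1, X, X)` holds because the induced tiles coincide with the three maximal regions, whereas `send_both` splits every region into singletons.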

B Other Notions of Approximate Privacy 

For the convenience of the reader, we repeat the discussion from Sec. 6.1 of [5] 
(the revision dated the same date as this report) of other possible approaches 
to approximate privacy. 

By our definitions, the worst-case/average-case PARs of a protocol are
determined by the worst-case/expected value of the expression
$|R^I(x)| / |R^P(x)|$, where $R^P(x)$ is the monochromatic rectangle induced by P
for input x, and $R^I(x)$ is the monochromatic region containing $A(f)_x$ in the
ideal monochromatic partition of A(f). That is, informally, we are interested in
the ratio of the size of the ideal monochromatic region for a specific pair of
inputs to the size of the monochromatic rectangle induced by the protocol for
that pair. More generally, we can define worst-case/average-case PARs with
respect to a function g by considering the ratio
$g(R^I(x), x) / g(R^P(x), x)$. Our definitions of PARs set g(R, x) to be the cardinality of
R. This captures the intuitive notion of the indistinguishability of inputs that 
is natural to consider in the context of privacy preservation. Other definitions 
of PARs may be appropriate in analyzing other notions of privacy. We suggest 
a few here; further investigation of these and other definitions provides many 
interesting avenues for future work. 
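
As a concrete illustration of the cardinality-based ratio, the sketch below computes worst-case and average-case objective PARs by brute force. The toy function (AND of two bits), the toy protocol, the default uniform distribution, and the helper names are all assumptions made for this example.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def group_by(key, inputs):
    """Map each input to the set of inputs that share its key."""
    groups = defaultdict(set)
    for x in inputs:
        groups[key(*x)].add(x)
    return {x: groups[key(*x)] for x in inputs}

def pars(f, s, X1, X2, dist=None):
    """Return (worst-case, average-case) objective PAR with g(R, x) = |R|."""
    inputs = list(product(X1, X2))
    ideal = group_by(f, inputs)      # R^I(x): inputs with the same value of f
    induced = group_by(s, inputs)    # R^P(x): inputs with the same transcript
    ratio = {x: Fraction(len(ideal[x]), len(induced[x])) for x in inputs}
    if dist is None:                 # assumption: uniform distribution
        dist = {x: Fraction(1, len(inputs)) for x in inputs}
    return max(ratio.values()), sum(dist[x] * ratio[x] for x in inputs)

# Hypothetical toy instance: AND of two bits; party 1 speaks first.
f = lambda x1, x2: x1 & x2
s = lambda x1, x2: (x1,) if x1 == 0 else (x1, x2)
worst, avg = pars(f, s, [0, 1], [0, 1])
```

In this instance the ideal region for outcome 0 has three inputs, while the input (1, 0) ends in a tile of size one, so the worst-case ratio is 3; the uniform average over the four inputs is 7/4.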

Probability mass. Given a probability distribution D over the parties'
inputs, a seemingly natural choice of g is the probability mass. That is, for
any region R, $g(R) = \Pr_D(R)$, the probability (according to D) that the input
corresponds to an entry in R. However, a simple example illustrates that this
intuitive choice of g is problematic: Consider a problem for which
$\{0, \ldots, n\} \times \{i\}$ is a maximal monochromatic region for
$0 \le i \le n-1$, as illustrated in the left part of Fig. 9. Let P be the
communication protocol consisting of a single round in which party 1 reveals
whether or not his value is 0; this induces the monochromatic tiling with tiles
$\{(0,i)\}$ and $\{(1,i), \ldots, (n,i)\}$ for
each i, as illustrated in the right part of Fig. 9. Now, let $D_1$ and $D_2$ be the
probability distributions over the inputs $x = (x_1, x_2)$ such that, for
$0 \le i \le n-1$ and $1 \le j \le n$,
$\Pr_{D_1}[(x_1,x_2) = (0,i)] = (1-\epsilon)/n$,
$\Pr_{D_1}[(x_1,x_2) = (j,i)] = \epsilon/n^2$,
$\Pr_{D_2}[(x_1,x_2) = (0,i)] = \epsilon/n$, and
$\Pr_{D_2}[(x_1,x_2) = (j,i)] = (1-\epsilon)/n^2$ for some small
$\epsilon > 0$. Intuitively, any reasonable definition of PAR should imply that, for $D_1$,
P provides "bad" privacy guarantees (because w.h.p. it reveals the value of $x_1$),
and, for $D_2$, P provides "good" privacy (because w.h.p. it reveals little about
$x_1$). In sharp contrast, choosing g to be the probability mass results in the same
average-case PAR in both cases.
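
The claim can be verified with exact arithmetic. The sketch below assumes one specific pair of distributions consistent with the description: one puts mass $(1-\epsilon)/n$ on each input $(0,i)$ and $\epsilon/n^2$ on each $(j,i)$ with $j \ge 1$, and the other swaps the roles of $\epsilon$ and $1-\epsilon$. These values, the choices of n and $\epsilon$, and the function name are assumptions for illustration.

```python
from fractions import Fraction

def avg_par_probability_mass(n, p_zero, p_other):
    """Average-case PAR with g(R) = Pr_D(R) for the one-round example.

    Columns are x1 in {0..n}; rows are x2 = i in {0..n-1}; each row is one
    ideal region. p_zero = Pr[(0, i)] and p_other = Pr[(j, i)] for 1 <= j <= n.
    """
    assert n * p_zero + n * n * p_other == 1, "probabilities must sum to 1"
    ideal_mass = p_zero + n * p_other          # mass of the whole row i
    tile_zero = p_zero                         # mass of the tile {(0, i)}
    tile_rest = n * p_other                    # mass of {(1, i), ..., (n, i)}
    total = Fraction(0)
    for _ in range(n):                         # one pass per row i
        total += p_zero * ideal_mass / tile_zero           # input (0, i)
        total += n * p_other * ideal_mass / tile_rest      # inputs (j, i)
    return total

n, eps = 10, Fraction(1, 100)
par_concentrated = avg_par_probability_mass(n, (1 - eps) / n, eps / n**2)
par_spread = avg_par_probability_mass(n, eps / n, (1 - eps) / n**2)
```

Each input contributes its own probability times the ratio of the row mass to its tile mass, so the tile masses cancel and both distributions give an average of exactly 2, independent of $\epsilon$.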

Figure 9: Maximal monochromatic regions (left) and protocol-induced rectangles
(right) for an example showing the deficiencies of PAR definitions based on probability
mass.

Other additive functions. In our definition of PAR and in the
probability-mass approach, each input x in a rectangle contributes to g(R,x) in a way
that is independent of the other inputs in R. Below, we discuss some natural
approaches that violate this condition, but we start by noting that other
functions that satisfy this condition may be of interest. For example, taking
$g(R,x) = 1 + \sum_{y \in R \setminus x} d(x,y)$, where d is some distance defined
on the input space, gives our original definition of PAR when
$d(x,y) = 1 - \delta_{x,y}$ and might capture other interesting definitions (in
which indistinguishable inputs that are farther away from x contribute more to
the privacy for x). (The addition of 1 ensures that the ratio
$g(R^I, x)/g(R^P, x)$ is defined, but that can be accomplished in other ways if
needed.) Importantly, here and below, the notion of distance that is used might
not be a Euclidean metric on the n-player input space $[0, 2^k - 1]^n$. It could
instead (and likely would) focus on the problem-specific interpretation of the
input space. Of course, there are many possible variations on this (e.g., also
accounting for the probability mass).
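
A short sketch shows how the distance-based choice of g specializes to the cardinality. For simplicity it assumes one-dimensional integer inputs; the region `R`, the absolute-difference distance, and the function names are hypothetical.

```python
def g_additive(region, x, d):
    """g(R, x) = 1 + sum of d(x, y) over all y in R other than x."""
    return 1 + sum(d(x, y) for y in region if y != x)

discrete = lambda x, y: 0 if x == y else 1   # d(x, y) = 1 - delta_{x,y}
absolute = lambda x, y: abs(x - y)           # hypothetical problem-specific distance

R = {3, 4, 5, 9}
size_like = g_additive(R, 4, discrete)       # equals |R|
weighted = g_additive(R, 4, absolute)        # far-away inputs count for more
```

With the discrete distance every other element of R contributes 1, so g equals |R| = 4; with the absolute difference the distant input 9 contributes 5 and g becomes 8.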

Maximum distance. We might take the view that a protocol does not 
reveal much about an input x if there is another input that is "very different" 
from x that the protocol cannot distinguish from x (even if the total number of 
things that are indistinguishable from x under the protocol is relatively small). 
For some distance d on the input space, we might then take g to be something
like $1 + \max_{y \in R \setminus x} d(y, x)$.
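
The following sketch, again assuming integer inputs and a hypothetical absolute-difference distance, shows that under this choice a small region containing one very different input scores higher than a larger region of nearby inputs.

```python
def g_max_distance(region, x, d):
    """g(R, x) = 1 + max of d(y, x) over y in R other than x (1 if none)."""
    return 1 + max((d(y, x) for y in region if y != x), default=0)

absolute = lambda a, b: abs(a - b)   # hypothetical distance on integer inputs
small_far = {4, 100}                 # two inputs, one very different from 4
large_near = {1, 2, 3, 4, 5, 6, 7}   # seven inputs, all close to 4
```

Here `g_max_distance(small_far, 4, absolute)` is 97, while `g_max_distance(large_near, 4, absolute)` is only 4.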

Plausible deniability. One drawback to the maximum-distance approach 
is that it does not account for the probability associated with inputs that are 
far from x (according to a distance d) and that are indistinguishable from x 
under the protocol. While there might be an input y that is far away from 
x and indistinguishable from x, the probability of y might be so small that 
the observer feels comfortable assuming that y does not occur. A more re- 
alistic approach might be one of "plausible deniability." This makes use of a 
plausibility threshold: intuitively, the minimum probability that the "far away"
input(s) (which is/are indistinguishable from x) must be assigned in order to
"distract" the observer from the true input x. This threshold might correspond 
to, e.g., "reasonable doubt" or other levels of certainty. We then consider how 
far we can move away from x while still having "enough" mass (i.e., more 
than the plausibility threshold) associated with the elements indistinguishable 
from x that are still farther away. We could then take g to be something like 
$1 + \max\{d_0 \mid \Pr_D(\{y \in R \mid d(y,x) \ge d_0\}) / \Pr_D(R) \ge t\}$; other variations might
focus on mass that is concentrated in a particular direction from x. (In
quantifying privacy, we would expect to only consider those R with positive probability,
in which case dividing by $\Pr_D(R)$ would not be problematic.) Here we use
$\Pr_D(R)$ to normalize the weight that is far away from x before comparing it
to the threshold t; intuitively, an observer would know that the value is in the 
same region as x, and so this seems to make the most sense. 
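
A plausibility-threshold choice of g might be sketched as follows. The sketch reads both comparisons in the formula as non-strict, which is an assumption, and it uses a hypothetical region of integer inputs in which the far-away input is very unlikely.

```python
from fractions import Fraction

def g_plausible(region_probs, x, d, t):
    """1 + max{d0 : Pr({y in R : d(y, x) >= d0}) / Pr(R) >= t}.

    region_probs maps each input y in R to Pr_D(y); Pr_D(R) must be positive.
    Only the distances that actually occur need to be tried as d0.
    """
    total = sum(region_probs.values())
    best = 0
    for d0 in sorted({d(y, x) for y in region_probs}):
        far_mass = sum(p for y, p in region_probs.items() if d(y, x) >= d0)
        if far_mass / total >= t:
            best = d0
    return 1 + best

dist = lambda a, b: abs(a - b)
# Hypothetical region: the far-away input 100 has probability only 1/100.
R = {4: Fraction(45, 100), 5: Fraction(45, 100),
     6: Fraction(9, 100), 100: Fraction(1, 100)}
strict_observer = g_plausible(R, 4, dist, Fraction(1, 20))
lenient_observer = g_plausible(R, 4, dist, Fraction(1, 100))
```

With threshold 1/20 the input 100 carries too little mass to count, so g is 3; lowering the threshold to 1/100 makes it plausible and g jumps to 97, which is the value the maximum-distance approach would assign.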

Relative rectangle size. One observation is that a bidder likely has a 
very different view of an auctioneer's being able to tell (when some particular 
protocol is used) whether his bid lies between 995 and 1005 than he does of 
the auctioneer's being able to tell whether his bid lies between 5 and 15. In 
each case, however, the bids in the relevant range are indistinguishable under 
the protocol from 11 possible bids. In particular, the privacy gained from an 
input's being indistinguishable from a fixed number of other inputs may (or may
not) depend on the context of the problem and the intended interpretation of the 
values in the input space. This might lead to a choice of g such as $\mathrm{diam}_d(R)/|x|$,
where $\mathrm{diam}_d$ is the diameter of R with respect to some distance d and |x| is
some (problem-specific) measure of the size of x (e.g., bid value in an auction). 
Numerous variations on this are natural and may be worth investigating. 
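
The bidding example can be made concrete with a small sketch. It assumes integer bids, the absolute difference as the distance d, and the bid value itself as the size |x|; these choices and the function name are illustrative only.

```python
from fractions import Fraction

def g_relative(region, x, d, size=abs):
    """g(R, x) = diam_d(R) / |x|, with the diameter taken over all pairs in R."""
    diam = max(d(a, b) for a in region for b in region)
    return Fraction(diam, size(x))

absolute = lambda a, b: abs(a - b)
bids_high = range(995, 1006)   # 11 indistinguishable bids around 1000
bids_low = range(5, 16)        # 11 indistinguishable bids around 10
```

Both regions contain 11 bids and have diameter 10, but `g_relative(bids_high, 1000, absolute)` is 1/100 while `g_relative(bids_low, 10, absolute)` is 1, reflecting the different relative uncertainty about the bid.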

Information-theoretic approaches. Information-theoretic approaches
using conditional entropy are also natural to consider when studying privacy,
and these have been used in various settings. Most relevantly, Bar-Yehuda et
al. [1] defined multiple measures based on the conditional mutual information
about one player's value (viewed as a random variable) revealed by the protocol
trace and knowledge of the other player's value. It would also be natural to study
objective-PAR versions using the entropy of the random variable corresponding
to the (multi-player) input conditioned only on the protocol output (and not
the input of any player). Such approaches might facilitate the comparison of
privacy between different problems.